For differential psychology to be of much value it is not sufficient
for it to determine present differences among individuals. It must be 
able to forecast them. And it must be able not only to forecast differ- 
ences among individuals; it must be able to forecast differences in the 
efficiency of the same individual in different kinds of endeavor, which 
is quite different. It is fairly clear, for example, that a scientific 
vocational guidance will never be realized until we are able to forecast 
for each individual on some uniform scale the various important type 
aptitudes in such a way as to indicate, at least roughly, how much more 
efficient he is likely to be in a given vocation than in any one of numer- 
ous others. Possibly such an ideal is incapable of realization but this 
naturally can only be determined by trial. It was primarily with the 
purpose of exploring the possibilities of such differential prediction that 
the present investigation was undertaken. 


I 


Consider any four fairly distinct vocations, such as life insurance 
salesman, watch repair man, commercial research chemist, pulpit 
orator. It is not improbable that for each one of these vocations 
a test battery could be devised which would yield aptitude forecasts 





1 From the Psychological Laboratory, University of Wisconsin. The writers 
are greatly indebted to Margaret V. Klein and Ruth J. Eken for the painstaking 
computations involved in the determination of the regression equations for algebra 
and English, respectively. 
which would correlate to the extent of .70 or better with actual efficiency
of people in the particular vocation. But it is hardly likely
that such a measure of the actual capacity of any one individual on all
four vocations would ever be obtainable because life is too short for 
the maximum attainment to be reached in so many vocations even 
if any one should ever seriously attempt to follow all of this particular 
combination. It is quite inconceivable that adequate and comparable 
criterion measurements on each of four such vocations should ever be 
obtainable on each of 100 or more individuals. Yet without some such 
related criterion data as suggested, it will be impossible to determine 
even for a few vocations, the accuracy with which test batteries 
differentiate between the various aptitudes of single individuals. 
Largely because of such obvious difficulties in the case of genuine voca- 
tions but partly because of the intrinsic importance of being able to 
forecast special abilities and disabilities in the learning of various 
school subjects, the aptitudes considered in the present investigation 
were taken from the field of education. The four aptitudes investi- 
gated were: Shorthand, typewriting, high school English and high 
school algebra. 

Elsewhere one of the writers¹ will soon have published a full account
of the investigation as related to shorthand and typewriting together 
with a description of the special experimental and statistical methods 
employed. That study should accordingly be consulted by the 
interested reader for a description of the tests, the conditions under 
which they were given as well as for the results with the two aptitudes 
there reported. It will suffice in this place merely to state that high 
school freshmen who were beginning the study of stenography and 
typewriting were given 40 group tests with a view to selecting from 
this number a battery which would yield useful forecasts of the 
aptitudes in question. Of the students thus tested, 107 came 
through with test and criterion records complete in every detail. 
These scores were used in the computation of the correlation coeffi- 
cients upon the basis of which the two batteries of tests were selected 
and the regression coefficients or weights given to each test determined. 

Of the 107 subjects just mentioned, 73 were at the same time 
beginning the study of high school English and algebra. Accordingly 
at the end of the year the marks in English and algebra for the 73 





1 Limp, Charles E.: A Prognostic Test for Shorthand and Typewriting of
First Year High School Students.

freshmen were compiled from the school records. As a result, in
addition to a perfectly complete set of test scores of wide range, there 
was then available for each of these 73 subjects, four distinct aptitude 
scores—one each for shorthand, typewriting, English and algebra. 
This made possible (1) the construction of two new test batteries and 
(2) a means of testing the differential prognostic power of four test- 
batteries, involving six differences in actual aptitudes. 


II 


The greater part of the 40 tests chosen for the preliminary try-out 
came from three well known batteries of group tests: Terman’s Group 
Test of Mental Ability, Hoke’s Group Prognostic Test of Stenographic 
Ability and Downey’s Group Will-temperament Test.¹ There were in
addition several other group tests which, for one reason or another, 
seemed promising. The main idea was to secure a wide range of group 
tests as different from each other as possible, to the end that samples 
of behavior might be secured of the determiners of human potentiali- 
ties from as broad a zone as possible. Correlations were computed not 
only between the criteria and the various tests making up the three 
batteries, but also between the criteria and the composite score of 
each battery as a whole, in the latter case the various parts being 
weighted as recommended by the respective authors. In these com- 
putations, however, 6 of the original 40 tests employed by Limp 
were discarded as his study had shown that for various reasons these 
particular tests were too unpromising to merit further attention. The 
list of tests finally chosen for investigation, together with the correlation
of each with each of the two criteria, is shown in Table I. The
small figures 1, 2, 3, or 4, standing before certain of the tests indicate 
that the tests in question found a place in the batteries of (1) shorthand, 
(2) typewriting, (3) English and (4) algebra, respectively. 

One of the striking things revealed by Table I, is the fact that 
Hoke’s complete battery designed for prognosis of shorthand aptitude 
has a much higher efficiency in forecasting aptitudes in both English 
and algebra than has Terman’s battery. Hoke’s battery yields 
correlations of .56 with English and .55 with algebra, whereas, Ter- 
man’s battery yields .42 and .32 for the respective aptitudes. 





1 Terman’s and Downey’s Tests are published by the World Book Co. Hoke’s 
tests may be secured from the author, Elmer R. Hoke, Annville, Pa. 
TABLE I.—SHOWING THE TESTS EMPLOYED, THE CORRELATION OF EACH WITH
EACH OF THE CRITERIA AND THE TEST BATTERIES, IF ANY, IN WHICH THE
VARIOUS TESTS FOUND A PLACE

                                                                          CORRELATION WITH
                                                                              CRITERIA
             NAME OF TEST                                                ALGEBRA   ENGLISH
             Terman’s Group Test (complete)                               +.32      +.42
             [name illegible] (Terman No. 1)                              +.07      +.23
(1)          Best answer (Terman No. 2)                                   +.20      +.29
(1) (2)      Word meaning (Terman No. 3)                                  +.10      +.35
(2)          Logical selection (Terman No. 4)                             +.36      +.22
             [name illegible] (Terman No. 5)                              +.29      +.32
(1) (2)      Sentence meaning (Terman No. 6)                              +.03      +.24
(4)          Analogies (Terman No. 7)                                     +.38      +.37
(3) (4)      Mixed sentences (Terman No. 8)                               +.37      +.36
             Classification (Terman No. 9)                                +.16      +.13
(4)          Number series (Terman No. 10)                                +.37      +.15
             Hoke’s Prognostic Test (complete)                            +.55      +.56
             Motor reaction—dotting squares (Hoke No. 1)                  +.18      +.15
(3) (4)      Speed of writing (Hoke No. 2)                                +.43      +.39
             Quality of writing (Hoke No. 3)                              +.14      +.08
             Speed of reading (a kind of completion test) (Hoke No. 4)    +.33      +.26
(1) (2) (3)  Spelling—choice between correctly and incorrectly spelled
               words (Hoke No. 6)                                         +.37      +.43
(4)          Symbols—a digit-symbols substitution test (Hoke No. 7)       +.46      +.27
             Downey Group Will-temperament Test (complete)                +.29      +.23
             Speed of decision—self estimates of character traits         +.21      +.1[?]
             Coordination—ability to write down long words on short
               [illegible]                                                +.30      +.22
             Freedom from load—difference between speeded writing and
               [illegible]                                                +.15      +.05
             Motor inhibition—test of how slowly pencil can be moved
               [illegible]                                                +.30      +.17
(3)          Volitional perseveration—ability to disguise handwriting     -.03      +.16
             Interest in detail—ability to imitate handwriting            +.09      +.11
             [name illegible]                                             -.02      +.01
             [name illegible]                                             +.06      +.10
             [name illegible]                                             -.13      -.22
             [name illegible]                                             -.03      -.04
             Easy directions (Woodworth-Wells)                            +.28      +.39
             [name illegible]                                             +.18      +.01
             Coordination of reaction (dotting squares, Henmon)           -.03      -.22
             Courtis addition—fundamentals                                +.25      +.10
(1) (2)      Courtis multiplication—fundamentals                          +.38      +.41


In this connection it should be noted that the relative forecasting
efficiency of tests or batteries yielding different r’s with criteria is
usually much greater than is indicated superficially by the size of
the r’s themselves. The efficiency in question is, however, a definite
function of r and is easily obtained by the formula:

$$E = 1 - \sqrt{1 - r^{2}}$$

Here E represents the percentage efficiency of the prognosis or forecast
of an aptitude from test scores.

In the case of the two test batteries just mentioned the actual 
forecasting efficiencies are as shown in the last column of Table II. 





TABLE II

Group test             Criterion      Correlation       Forecasting
                                      coefficient, r    efficiency, E
Hoke (complete)        English            .56              17.1%
Terman (complete)      English            .42               9.2%
Hoke (complete)        Algebra            .55              16.5%
Terman (complete)      Algebra            .32               5.3%

Hoke (complete)        Shorthand          .36¹              6.7%
Hoke (complete)        Typewriting        .22¹              2.4%








1 Taken from Limp’s manuscript previously mentioned. 
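
The efficiencies in Table II can be recomputed from the correlations alone. The following is a minimal sketch, assuming the relation E = 1 - sqrt(1 - r^2) given for the forecasting efficiency; the function name and the list layout are illustrative choices and not part of the original study.

```python
import math


def forecasting_efficiency(r: float) -> float:
    """Return E = 1 - sqrt(1 - r^2) as a fraction (0.17 means 17 per cent)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return 1.0 - math.sqrt(1.0 - r * r)


# (battery, criterion, r) as reported in Table II
TABLE_II = [
    ("Hoke", "English", 0.56),
    ("Terman", "English", 0.42),
    ("Hoke", "Algebra", 0.55),
    ("Terman", "Algebra", 0.32),
    ("Hoke", "Shorthand", 0.36),
    ("Hoke", "Typewriting", 0.22),
]

for battery, criterion, r in TABLE_II:
    e_percent = 100.0 * forecasting_efficiency(r)
    print(f"{battery:8s}{criterion:13s}r = {r:.2f}   E = {e_percent:4.1f}%")
```

The recomputed values agree with the printed column to within rounding in the last digit, and the ratios between them bear out the comparison of the two batteries more plainly than the r's do.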


Hoke’s battery is seen to be about twice as efficient as Terman’s on 
English and about three times as efficient on algebra. These results 
are particularly striking because Hoke’s battery apparently was never 
intended to be used in forecasting aptitudes in anything but short- 
hand (and possibly typewriting) whereas, Terman’s tests seem to have 
been designed more or less definitely to reveal a kind of average apti- 
tude in the ordinary academic subjects. It is interesting to note, 
however, that several of the separate tests of Terman’s battery in the 
case of algebra, each yield actually higher r’s with the criterion than 
the entire battery, themselves included. Because of inappropriate 
weighting, the value of these really good elements for this particular 
aptitude is quite lost, for obviously a prognostic efficiency of only 5
or 10 per cent is scarcely worth the expense of testing. 

Perhaps even more striking than the above is the fact that Hoke’s 
battery (complete) gives a much higher correlation with both English 
and algebra than with shorthand itself. In terms of forecasting

1 In more technical terms, E represents the percentage reduction in the actual
error of estimating, or forecasting, an aptitude over the amount of error or
inaccuracy that would result from making such a forecast from a set of test scores
correlating zero with the criterion; i.e., a set having a purely chance relation and
giving no information at all.

efficiency this battery is seen (Table II) to be more than twice as efficient
on English and algebra as on shorthand for which it was originally 
designed. It is probable that, with proper weighting for stenography, 
the prognostic efficiency of this battery might be doubled at least. 
As it stands it probably does not repay the cost of using, though it 
evidently contains some very valuable elements. 

The showing of Downey’s battery of Will-temperament Tests was 
disappointing. We had hoped much from these tests especially 
because they promised to measure a number of important traits not 
touched by the ordinary intelligence tests. With very few exceptions 
the correlations with both criteria hover closely around zero. Time 
after time the attempt was made to include one or another of these in 
one of the batteries, but without success. In every case, except one, 
where a correlation with a criterion was found to be of any size it 
turned out that the test in question was so highly correlated with some 
other and better test already in the battery that the Downey Test 
would have added nothing of importance. 

Of the miscellaneous single tests, the Easy Directions test corre- 
lated with the various criteria fairly well as single tests go, but was 
rejected from the batteries because it correlated so highly with still 
better tests already chosen. Courtis’ multiplication showed up notice- 
ably better than his addition, and correlates fully as well with English 
as with algebra. This last, while paradoxical to introspection, is not 
likely to cause any surprise among experimentalists. Multiplication 
accordingly finds a place in two of the batteries. 


III 


The tests finally chosen for the two aptitude batteries not previously
reported are assembled in Table III.


TABLE III
ENGLISH                                  ALGEBRA
1. Mixed sentences (Terman 8)            1. Analogies (Terman 7)
2. Speed of writing (Hoke 2)             2. Mixed sentences (Terman 8)
3. Spelling—recognition (Hoke 6)         3. Number series (Terman 10)
4. Volitional perseveration (Downey)     4. Speed of writing (Hoke 2)
                                         5. Symbols (Hoke 7)


As already suggested, the choices in each case were made only after 
careful analysis of the available material by means of partial and 
multiple correlation. The mere fact that a number of tests are avail- 
able, each yielding a fairly high correlation with the criterion, does not 
mean necessarily that all should be included in the battery. If two
or more of them test practically the same type of behavior as shown
by a high correlation between the tests themselves, it is largely a waste
of time to use them all just as it would be to use the same test over and
over again. It may be much better to choose a test with a distinctly
smaller correlation with the criterion provided it measures a trait not
already represented on the battery as shown by a low correlation with
the other tests to be used. There may even be situations where a
test with practically a zero, or better a negative criterion correlation,
may be valuable in a battery if it has high positive correlation with
another test which is itself highly correlated positively with the
criterion. The reason for this is that in the process of partial correlating
the smaller of the two criterion r’s is lowered so far that it passes
the zero point and attains considerable size on the other side, though
with opposite sign. In such cases the test will receive a weight in the
battery with a sign opposite to that of the original criterion r. The
same principle holds, of course, if the signs of all the r’s are opposite
those assumed above.
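
The case of a test with zero criterion correlation earning a place in a battery can be illustrated numerically. The sketch below uses the standard two-predictor formulas for standardized regression weights; the three correlations are hypothetical values chosen for illustration and do not come from the present data.

```python
import math


def two_test_weights(r1c: float, r2c: float, r12: float):
    """Standardized regression weights for two tests predicting one criterion.

    r1c, r2c: correlations of test 1 and test 2 with the criterion.
    r12:      correlation between the two tests.
    Returns (weight_1, weight_2, multiple_R).
    """
    denom = 1.0 - r12 ** 2
    w1 = (r1c - r2c * r12) / denom
    w2 = (r2c - r1c * r12) / denom
    multiple_r = math.sqrt(w1 * r1c + w2 * r2c)
    return w1, w2, multiple_r


# Hypothetical case: test 1 correlates .60 with the criterion, test 2
# correlates zero with it, and the two tests correlate .70 with each other.
w1, w2, big_r = two_test_weights(r1c=0.60, r2c=0.00, r12=0.70)
print(f"weight of test 1: {w1:+.2f}")
print(f"weight of test 2: {w2:+.2f}")
print(f"multiple R:       {big_r:.2f}")
```

Under these assumed values the second test receives a negative weight and the team of two forecasts the criterion better than the first test does alone, which is the situation the paragraph above describes.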

The weights or relative values to be given the scores of the respective
tests of each team were determined by means of regression equations.
The weights for the various tests in the English battery are
given by the equation:

$$\text{Mark in English} = .52X_1 + .29X_2 + .19X_3 + .65X_4 + 51.4$$

Here the X’s stand for the scores received by any particular person
in the tests of the battery, the small numerical subscripts indicating the
particular tests as numbered in Table III. This means simply that
the score on the first test (mixed sentences) is multiplied by .52, the
score on the second test by .29 and so on with the others to test four.
Then the products are all added, together with 51.4, the result being
the most probable school mark in Grade IX English possible to estimate
on the basis of scores from these particular tests.

The weights for the tests in the algebra battery are given by the
equation:

$$\text{Mark in algebra} = .69X_1 + .25X_2 + .86X_3 + .41X_4 + .45X_5 + 16.5$$

where the subscripts to the X’s refer to the numbers of the tests of the
algebra battery as listed in Table III, and the computation of the
most probable mark in Grade IX algebra is carried out exactly as
described in the case of English.
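
The arithmetic of the two regression equations may be sketched as follows. The weights and constants are those stated above, with the English weights all taken as positive since the products are said to be added; the pupil's test scores are hypothetical, and the function and variable names are illustrative only.

```python
# Weights in the order of the tests in Table III, followed by the constant.
ENGLISH_WEIGHTS = (0.52, 0.29, 0.19, 0.65)  # signs assumed positive
ENGLISH_CONSTANT = 51.4
ALGEBRA_WEIGHTS = (0.69, 0.25, 0.86, 0.41, 0.45)
ALGEBRA_CONSTANT = 16.5


def predicted_mark(scores, weights, constant):
    """Multiply each test score by its weight, add the products and the constant."""
    if len(scores) != len(weights):
        raise ValueError("one score is needed for each test in the battery")
    return sum(w * x for w, x in zip(weights, scores)) + constant


# Hypothetical raw scores for one pupil (not taken from the study).
english_scores = (12, 20, 30, 8)
algebra_scores = (10, 12, 9, 20, 25)

print(f"{predicted_mark(english_scores, ENGLISH_WEIGHTS, ENGLISH_CONSTANT):.1f}")
print(f"{predicted_mark(algebra_scores, ALGEBRA_WEIGHTS, ALGEBRA_CONSTANT):.1f}")
```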

1 Hull, Clark L.: The Joint Yield from Teams of Tests. Journal of Educational
Psychology, October, 1923.

When each of the 73 subjects’ test scores were substituted in the
respective regression equations and the resulting estimates of school 
marks correlated with the two sets of actual marks, the following 
coefficients were obtained: English .65, algebra .74. Correlation 
coefficients obtained from data like these are generally used to indicate 
the strength or validity of tests or batteries. The present writers, 
because of inadequate experimental data, also follow the custom 
though recognizing the practice as not entirely desirable. One reason 
for its undesirability is that criterion estimates on subjects whose 
scores were used to determine the weights in the regression equation, 
tend to give too high correlations. A second reason for its undesira- 
bility is the fact that the results of a test or battery correlated with an 
ordinary fallible or inaccurate criterion score, yield a coefficient which 
is too low. The obvious remedy for these undesirable phases of the 
situation would be to test an entirely new group of subjects and then 
correct the resulting correlation coefficients for the attenuation due 
to the inaccuracy of the criterion scores. It should be observed, 
however, that the two sources of error noted above operate in opposite 
directions and consequently tend to neutralize each other. And while 
the size of the first type of error has not been exactly determined, the 
chances are that it is smaller, upon the whole, than the second. Thus 
no serious error of overstatement is likely in the case of the above 
coefficients. 
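
The correction for attenuation mentioned above can be sketched with the usual formula for a fallible criterion, in which the observed coefficient is divided by the square root of the criterion's reliability. No reliability of school marks is reported here, so the value of .80 used below is purely hypothetical, and the function name is illustrative.

```python
import math


def correct_for_criterion_attenuation(r_observed: float, criterion_reliability: float) -> float:
    """Estimate the correlation with a perfectly reliable criterion.

    Divides the observed r by the square root of the criterion reliability.
    """
    if not 0.0 < criterion_reliability <= 1.0:
        raise ValueError("reliability must be greater than 0 and at most 1")
    return r_observed / math.sqrt(criterion_reliability)


# Observed battery coefficients from the text; reliability of .80 is assumed.
for subject, r in (("English", 0.65), ("algebra", 0.74)):
    corrected = correct_for_criterion_attenuation(r, criterion_reliability=0.80)
    print(f"{subject}: observed r = {r:.2f}, corrected r = {corrected:.2f}")
```

The correction raises the coefficient, whereas testing a new group of subjects would be expected to lower it, which is why the two sources of error tend to offset each other.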

Of much more significance than the correlation coefficients is the
degree of forecasting efficiency which they represent. Applying to
them the formula as before described we find the English battery
possessed of a prognostic efficiency of 24 per cent while the algebra
battery possesses an efficiency of 33 per cent.¹

The coefficients themselves (.65 and .74) are rather high as aptitude 
battery coefficients go, especially the latter. But when these coeffi- 
cients are found to represent forecasting efficiencies of only about one- 
fourth and one-third, respectively, a question naturally arises as to 
whether this is sufficient to warrant the expenditures of time and 
money involved in using the tests. It is quite true that there are

1 There is reason to believe that much misapprehension regarding what may
be expected from psychological tests would disappear if the strength of test
batteries should be reported in terms of E instead of r. The fact that values of r
range from zero to 1.00 apparently leads large numbers unconsciously to assume
that r represents a kind of percentage agreement between forecast and aptitude,
which is very far from being the case, particularly in the range where such r’s fall.

probably very few if any tests, even those in widest use, which have a
forecasting efficiency on any significant criterion which runs much
above one-third and most of them fall between 10 per cent and
25 per cent. It should also be remembered that an r must run up to 
.87 to be 50 per cent efficient and to .80 to be even 40 per cent efficient. 
Thus aptitude testing is probably doomed forever to an efficiency less 
than 50 per cent and possibly even below 40 per cent. Naturally the 
lower limit of useful forecasting efficiency will vary with the expense 
of using the tests. With tests such as considered above, the writers 
venture the opinion that a forecasting efficiency of 20 per cent (r = 
.60) is about the lower limit. 
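
The correlations quoted for 50, 40 and 20 per cent efficiency can be checked by inverting the efficiency formula. This is a small sketch, assuming E = 1 - sqrt(1 - r^2), which gives r = sqrt(1 - (1 - E)^2); the function name is an illustrative choice.

```python
import math


def r_required(efficiency: float) -> float:
    """Correlation needed to reach a given forecasting efficiency (as a fraction)."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return math.sqrt(1.0 - (1.0 - efficiency) ** 2)


for e in (0.50, 0.40, 0.20):
    print(f"E = {100 * e:.0f} per cent requires r = {r_required(e):.2f}")
```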


IV


With test batteries provided for each of our four criteria, we may 
now consider in some detail the extent to which the various aptitudes 
of the same individual may be distinguished from each other in 
advance, by means of aptitude estimates or forecasts made from such 
test batteries. This fundamental problem may be considered from a 
number of different angles. In the first place it is conceivable that all 
aptitudes are essentially alike, all of them being so largely dependent 
upon some central factor such as “general” intelligence, that no important
differences exist. This alone would preclude the possibility of any
individual differential psychology such as is suggested at the beginning 
of this article. In the second place it is possible, owing to the high 
correlation existing among most psychological tests, that the aptitude 
estimates for the same individual made from the various test batteries, 
will be practically identical. This also would preclude the possibility 
of an individual differential psychology. Lastly and perhaps immedi- 
ately most important of all, we must determine what agreement, if 
any, exists between the differences among the aptitudes of an individual 
on the one hand, and on the other, differences among the estimates 
of the corresponding aptitudes of the same individual made by means 
of test batteries. These questions will be considered in the order 
indicated. 

The degree to which two variables are distinct is best shown by 
Kelley’s coefficient of alienation, k.¹ The k-values for the six possible
combinations of the four aptitudes, together with the correlation 


coefficients from which they have been derived, are shown in Table 
IV. 





1 Kelley, T. L.: “Statistical Method,” p. 173. The formula is:
$$k = \sqrt{1 - r^{2}}$$
TABLE IV.—TABLE SHOWING COEFFICIENTS OF CORRELATION AND OF ALIENATION
BETWEEN THE VARIOUS APTITUDES

Aptitudes by pairs                 r       k
Shorthand and typewriting         .56     .83
Shorthand and English             .75     .66
Shorthand and algebra             .71     .70
Typewriting and English           .34     .94
Typewriting and algebra           .60     .80
English and algebra               .87     .49
Average                           .64     .73











Here for example, it appears that the coefficient of alienation
between shorthand and typewriting is .83, whereas the coefficient of
correlation is only .56. Thus aptitude scores of shorthand and
typewriting correlate with each other considerably less than half.¹
The factors not common to these variables are considerably larger in
their aggregate influence, than those which are common. The average
k is .73, which indicates a very considerable possibility of differentiation
among the various aptitudes. The only value which is really
low is that between English and algebra. It should be noted, however,
that the above mathematical possibility of differentiation is not a full
psychological possibility. The k’s include not only the psychological
differences between the aptitudes but also all sorts of chance factors
such as errors of measurement, etc. If the r’s had been corrected
for attenuation they would have been somewhat larger and the k’s
appreciably smaller. But even so, it is evident that with nearly
all of the aptitudes, there is a very considerable possibility of
differentiation.
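
The k-values for the aptitude pairs follow from the r's by Kelley's relation k = sqrt(1 - r^2). The sketch below recomputes them from the correlations reported for the six pairs; the dictionary layout and function name are illustrative and not part of the original computations.

```python
import math


def alienation(r: float) -> float:
    """Kelley's coefficient of alienation, k = sqrt(1 - r^2)."""
    return math.sqrt(1.0 - r * r)


# Correlations between the actual aptitudes, by pairs, as reported in Table IV.
APTITUDE_R = {
    "Shorthand and typewriting": 0.56,
    "Shorthand and English": 0.75,
    "Shorthand and algebra": 0.71,
    "Typewriting and English": 0.34,
    "Typewriting and algebra": 0.60,
    "English and algebra": 0.87,
}

for pair, r in APTITUDE_R.items():
    print(f"{pair:28s} r = {r:.2f}   k = {alienation(r):.2f}")
```

Note that k exceeds r whenever r is below .707, so that for every pair except the first two academic and commercial combinations with r above that value, the factors not held in common outweigh those held in common.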

We may now consider the second problem before us. Assuming 
substantial differences to exist between the majority of the aptitudes 
under consideration, are the various individual aptitude forecasts or 
estimates made from the respective test batteries, sufficiently distinct 
to make individual aptitude differentiation possible? It is quite clear, 
of course, that if, because of the high correlation among psychological 
tests, the forecasts for each given individual should be practically 
identical for all his various aptitudes, then no individual differential 


1 Curiously enough, half a correlation in this sense is not .50 but .707. See
Hull op. cit. p. 404. Half a correlation in the sense of forecasting efficiency is .866.

psychology would be possible. The appropriate test scores of each of
the 73 subjects were therefore substituted in each of the four regression 
equations. The four sets of aptitude estimates which resulted were 
then correlated. These coefficients are shown in Table V, together 
with the corresponding coefficients of alienation. 


TABLE V.—TABLE SHOWING COEFFICIENTS OF CORRELATION AND OF ALIENATION
FOR THE VARIOUS COMBINATIONS OF APTITUDE ESTIMATES

Aptitude estimates                  r             k
Shorthand and typewriting          .86           .51
Shorthand and English              .73           .68
Shorthand and algebra              .57           .82
Typewriting and English        [illegible]       .90
Typewriting and algebra            .23           .97
English and algebra                .73           .68
Average                            .59           .76





Upon the whole these coefficients are even more promising for 
individual differential psychology than the coefficients obtained from 
the aptitudes themselves. The only k which is conspicuously low is
that between shorthand and typewriting. It is interesting to note in 
this connection that four out of the five tests making up each of these 
two batteries are identical, the four scores in question merely receiving 
different weights in the two equations. As may be seen from an 
examination of Table I, the same thing holds to a considerable extent 
of all the battery combinations. For this reason as well as by reason 
of the well known tendency for all psychological tests to correlate 
highly with each other, the relatively high k values came as somewhat 
of a surprise to the writers. The implication seems to be that aptitude 
estimates made from test batteries are largely distinct even though 
there be considerable actual identity of the test scores making up the 
batteries in question. 
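
How two batteries built from the same tests can still yield distinct estimates may be seen from the correlation between two weighted composites, which depends on the weights as well as on the tests. The sketch below is an illustration only: the intercorrelation matrix and both sets of weights are hypothetical, the scores are assumed to be in standard form, and the function name is not from the original study.

```python
import math


def composite_correlation(w, v, R):
    """Correlation between two weighted sums of the same standardized tests.

    w, v: the two sets of weights.
    R:    matrix of intercorrelations among the tests (1.0 on the diagonal).
    """
    def cross(a, b):
        n = len(a)
        return sum(a[i] * R[i][j] * b[j] for i in range(n) for j in range(n))

    return cross(w, v) / math.sqrt(cross(w, w) * cross(v, v))


# Hypothetical: three tests intercorrelating .40, weighted differently
# in two batteries.
R = [
    [1.0, 0.4, 0.4],
    [0.4, 1.0, 0.4],
    [0.4, 0.4, 1.0],
]
battery_a = (0.6, 0.1, 0.3)
battery_b = (0.1, 0.7, 0.2)

r_ab = composite_correlation(battery_a, battery_b, R)
k_ab = math.sqrt(1.0 - r_ab ** 2)
print(f"r between estimates = {r_ab:.2f}, k = {k_ab:.2f}")
```

With identical weights the function returns 1.00; it is the difference in weighting that pulls the two sets of estimates apart, even though every test is shared.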

This brings us to the third of the problems mentioned above, con- 
cerning the differentiating power of aptitude test batteries. Accord- 
ing to the ordinary law of probability, the average person may be 
expected to possess about equal capacity on a considerable number of 
vocations. In addition we may expect that there will be some voca- 
tions in which he could not do as well as his general average and one 
or two in which he has very, very much less capacity. On the other
hand there will probably be some vocations in which he has distinctly 
more than his average capacity and one or two in which he has very, 
very much more than his average capacity. Now from the point of 
view, both of the individual and society, it is very desirable that the 
individual engage in that vocation in which he may be most efficient.¹
This will rarely come about by chance since in the nature of the case
the one or two occupations in which a person may reach his maximum
efficiency bear such a small ratio to the total number of possible
occupations. And it is the task of individual differential psychology 
to assist the individual to find this vocation if possible. It accordingly 
becomes important to know how potent modern test batteries are to 
do this. 

In terms of the particular academic aptitudes under consideration 
in the present study, the problem may be stated as follows: If the 
aptitude forecasts for shorthand and typewriting indicate that a boy 
probably has more aptitude in the former than the latter by eight 
points of the ordinary marking system, how strong is the tendency for 
his actual grades in shorthand to be better than those in typewriting 
by a proportional amount? The method of making the determination 
was quite simple. First, the four columns of criterion scores and the 
corresponding four columns of aptitude estimates obtained by substi- 
tuting the appropriate test scores in the respective regression equations, 
were all transmuted into strictly comparable series having exactly the 
same means and standard deviations.² Then the aptitudes were
treated by pairs (e.g., shorthand and typewriting) as follows: The 
shorthand and typewriting criterion scores of each subject were first 
subtracted, yielding a column of plus and minus differences. Next 
the corresponding estimated scores were subtracted, yielding a second 
column of plus and minus differences. If the two test batteries in 
question were making a perfect differentiation of the two aptitudes, 
the column of estimate differences would be the same as the column 
of aptitude differences. The degree of the tendency toward agreement 
was found by computing the coefficient of correlation. The coefficients 
for the various possible differentiations are shown in Table VI. 
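
The procedure just described may be sketched as follows: bring all four columns to a common mean and standard deviation, subtract by pairs, and correlate the two columns of differences. The marks and estimates below are hypothetical figures for six pupils, and the function names are illustrative; the sketch uses a mean of 0 and a standard deviation of 1 as the common scale, any other common values giving the same coefficient.

```python
import math


def standardize(scores):
    """Rescale a column of scores to mean 0 and standard deviation 1."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / n)
    return [(s - mean) / sd for s in scores]


def pearson_r(xs, ys):
    """Product-moment coefficient of correlation between two columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def difference_correlation(crit_a, crit_b, est_a, est_b):
    """Correlate actual aptitude differences with forecast differences."""
    crit_a, crit_b, est_a, est_b = map(standardize, (crit_a, crit_b, est_a, est_b))
    actual_diff = [a - b for a, b in zip(crit_a, crit_b)]
    forecast_diff = [a - b for a, b in zip(est_a, est_b)]
    return pearson_r(actual_diff, forecast_diff)


# Hypothetical marks and battery estimates for six pupils.
shorthand_marks = [70, 82, 65, 90, 75, 88]
typing_marks = [72, 78, 80, 85, 60, 91]
shorthand_estimates = [74, 80, 70, 84, 76, 83]
typing_estimates = [75, 77, 79, 82, 68, 86]

r_diff = difference_correlation(
    shorthand_marks, typing_marks, shorthand_estimates, typing_estimates
)
print(f"correlation of differences: {r_diff:.2f}")
```

If the estimates reproduced the marks exactly the coefficient would be 1.00, which is the case of perfect differentiation referred to above.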





1 From the point of view of both the individual and of society, this statement 
needs qualifications. One of the writers expects to give this problem careful con- 
sideration in a future publication. 

2 Hull, Clark L.: A Method of Converting Test Scores into Series Which Shall 
Have Any Assigned Mean and Degree of Dispersion. Journal of Applied Psy- 
chology, 1922, Vol. VI, pp. 298-300. 
TABLE VI.—SHOWING THE EXTENT TO WHICH DIFFERENCES BETWEEN APTITUDE
ESTIMATES MADE FROM TEST BATTERIES CORRELATE WITH DIFFERENCES
IN THE CORRESPONDING ACTUAL APTITUDES

                                                     r
Difference between aptitude in
  shorthand and typewriting                     [illegible]
Difference between aptitude in
  shorthand and English                             .42
Difference between aptitude in
  shorthand and algebra                         [illegible]
Difference between aptitude in
  typewriting and English                       [illegible]
Difference between aptitude in
  typewriting and algebra                           .31
Difference between aptitude in
  English and algebra                               .23





The most striking thing about these coefficients is their smallness 
as compared with most of the coefficients between the test batteries 
and their respective criteria (see Table VII). It is probably significant 
that the two poorest differentiations are between the pairs of commer- 
cial subjects on the one hand and the pair of regular academic subjects 
on the other. It is probably significant also that the best differentia- 
tion is between typewriting and English, which shows the highest k 
value (Table IV). The better half of the difference coefficients com- 
pare fairly well with the common run of aptitude coefficients obtained 
in practice though at that their efficiency is only 10 or 12 per cent. 
It is probable that the above difference coefficients (Table VI) give a 
very fair indication of the differentiating power of the frankly apti- 
tude tests in existence at the present time and of what may be expected 
for some time to come. In any case it should serve as a warning 
against assuming a high power of individual differentiation of aptitudes 
by tests merely because they correlate between .60 and .70 with 
particular aptitudes. 
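
The "efficiency" figure quoted above can be checked numerically. The sketch below assumes that it refers to the index of forecasting efficiency, E = 1 - sqrt(1 - r^2); the text does not state the formula, so this reading, like the function name, is an assumption. The coefficients are the six difference correlations listed in the Summary.

```python
import math


def forecasting_efficiency(r: float) -> float:
    """Proportional reduction in the error of estimate for a correlation r.

    Assumed formula (not stated in the text): E = 1 - sqrt(1 - r**2).
    """
    return 1.0 - math.sqrt(1.0 - r * r)


# The six difference coefficients reported in the Summary.
difference_coefficients = (0.17, 0.23, 0.31, 0.42, 0.42, 0.50)

for r in difference_coefficients:
    pct = 100.0 * forecasting_efficiency(r)
    print(f"r = {r:.2f}  efficiency = {pct:.1f} per cent")
```

Under this assumption the better half of the coefficients (.42, .42 and .50) come out at roughly 9 to 13 per cent, which is in line with the "10 or 12 per cent" of the text.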

Before leaving this phase of the subject it should be pointed out 
as a matter of method that the most precise way of securing a differen- 
tiation between two aptitudes is probably to construct a regression 
equation for that particular purpose. The equation would serve to 
predict, not the aptitude in English, say, nor yet that in typewriting,
but the difference between the two aptitudes, i.e., predict in which
aptitude a given individual is likely to do better and by how much.
This has never been done to the knowledge of the writers but is perfectly
feasible. Nothing beyond the ordinary methods used in multiple
regression equations will be required. Just how much more
accurate such a method would be in practice is a question for future
investigation.1 It is also quite possible that the extensive study of the
correlations between tests and the differences between aptitudes might 
throw considerable new light on the general subject of the differentia- 
tion of special aptitudes in individuals. As a practical means of 
guidance where many aptitudes were in question, however, it has 
the defect that very many more equations and estimates would be 
required than by the method used above. For example where 15 
aptitudes were involved, the above method would require 15 equations 
and estimates, whereas the method just mentioned would require 105. 
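
The count of 105 is the number of distinct pairs that can be formed from 15 aptitudes, n(n - 1)/2. A minimal sketch of the comparison follows; the function name is hypothetical.

```python
from itertools import combinations
from math import comb


def equations_needed(n_aptitudes: int) -> tuple:
    """Return (equations when each aptitude is estimated separately,
    equations when every pair of aptitudes gets its own difference equation)."""
    return n_aptitudes, comb(n_aptitudes, 2)


print(equations_needed(15))  # (15, 105)

# The four aptitudes of the present study give six pairs,
# one for each difference coefficient reported.
aptitudes = ["shorthand", "typewriting", "English", "algebra"]
pairs = list(combinations(aptitudes, 2))
print(len(pairs))  # 6
```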


V 


It is highly probable that as our knowledge of scientific individual 
differential psychology progresses, we will discover certain aptitudes 
which are so similar that a single test battery will serve for them all 
alike. Indeed because of the multiplicity of vocations, it seems 
necessary that they should be grouped more or less under single 
aptitudes (at some sacrifice of prognostic efficiency, if necessary) if 
anything like a complete survey of the vocational potentialities of 
individuals is to become a reality. The precise location of these aptitude
foci or strategic vocations around which other vocations cluster most 
closely is not likely to prove either a simple task or one soon concluded. 
However, as a preliminary explorational step in that direction, we may 
raise the question as to how much better, if any, the various aptitudes 
are estimated by the test batteries designed especially for them, than 
those not intended for them at all? Obviously, if some one or two 
of the batteries will forecast all four aptitudes as well as those specifi- 
cally constructed for the purpose, the construction of the remaining 
batteries and the making of the forecasts is mere wasted effort. 
Accordingly the correlation coefficients between each of the four 
aptitudes on the one hand and each of the four aptitude estimates 
on the other, were computed. These appear in Table VII. 





1 One of the writers expects soon to publish the results of such a study. 
TABLE VII.—CORRELATIONS BETWEEN THE VARIOUS APTITUDES AND APTITUDE
ESTIMATES. THE r’S BETWEEN THE VARIOUS APTITUDES AND THE CORRESPONDING
APTITUDE ESTIMATES ARE INDICATED BY BOLD FACED TYPE

                    Aptitude      Aptitude      Aptitude      Aptitude
                    estimate,     estimate,     estimate,     estimate,
Aptitudes           shorthand     typewriting   English       algebra
                    battery       battery       battery       battery

Shorthand.........    .51           .55           .27           .48
Typewriting.......    .31           .61           .29           .44
English...........    .51           .17           .65           .59
Algebra...........    .39           .27           .61           .74

















A careful examination of Table VII reveals a fairly complex situa- 
tion. Upon the whole the aptitudes are estimated much better by 
the batteries designed for them than by batteries designed for the other 
aptitudes. One exception is shorthand. This aptitude is estimated 
slightly better by the typewriting battery than by its own.1 On the
other hand no other battery does nearly so well in forecasting type- 
writing aptitude as this same battery. If one of the two batteries 
should need to be chosen for both aptitudes, evidently the typewriting 
battery would be much superior to the shorthand battery. 

Turning now to English and algebra, we find the algebra battery 
nearly as good for English as the English battery itself, and the 
English battery estimating algebra nearly as well as it does its own 
aptitude. But if a choice needed to be made of one battery to fore- 
cast both aptitudes, the choice would clearly fall on the algebra 
battery. And if a single battery must be chosen to forecast all four 
aptitudes it again would fall on the algebra battery. In the latter 
event, there would be possible, of course, no differentiation of individ- 
ual aptitudes whatsoever. 





1 It will be recalled that the regression equations for shorthand and typewriting
were based on the results from 107 subjects whereas those for English and
algebra were based only on 73 of the same subjects. On the other hand the r’s 
in Table VII are all based on the results of the 73 subjects. This produces some- 
what different correlation coefficients in the case of shorthand and typewriting 
from those reported by Limp. 
VI


SUMMARY 


Group test scores and aptitude criterion scores were available for 
73 first year high school students. Aptitude test batteries were organized
for shorthand, typewriting, English and algebra. These batteries
correlated with their criteria to the extent of .51, .61, .65 and .74, 
respectively. When these test batteries were employed to estimate 
the nature and extent of the difference between a person’s aptitude to 
do one thing and a second thing, the correlation coefficients were less: 
.17, .23, .31, .42, .42 and .50. This shows that to distinguish the 
various degrees of different aptitudes within a single person is a differ- 
ent and more difficult thing than to distinguish the various amounts of 
a single aptitude in different persons. But the study does show that 
there probably are real differences in the aptitudes of individuals 
for such similar activities as learning the subjects taught in high school. 
The study suggests that a scientific vocational guidance may be 
realized by making forecasts on numerous type vocations from many 
test batteries made up from various combinations and weightings of a 
relatively small number of original test scores. The determination of 
the foci of vocational types and the lines of vocational cleavages are 
difficult problems which the future must face. 
COMPARATIVE RELIABILITIES OF FIVE TYPES OF
OBJECTIVE EXAMINATIONS 


G. M. RUCH AND G. D. STODDARD 
State University of Iowa 


Part I 


Statement of Problem.—The rapid increase in the use of objective 
examinations has rendered imperative the formulation of a series of 
recommendations for the construction and use of such methods. It is 
the purpose of this article to present experimental results which will 
assist, it is hoped, in the formulation of an empirical basis for deciding 
the relative merits of five of the proposed methods. 

At the present time objective examinations most often take the 
form of completion exercises, multiple response tests, true-false ques- 
tions, matching exercises, etc. These vary among themselves in 
ease of construction, ease of scoring, time needed for administration, 
adaptability to the various school subjects, degree of naturalness or 
artificiality, susceptibility to chance or guessing, relative ease of 
answering, etc. The choice of the particular type to be used has been 
largely a matter of personal opinion. There is little known of their
comparative merits in a critical way. 

The Related Literature.—Space will not permit a review of the few 
published studies available, or even more detailed reference than can 
be noted in the bibliography appended. The more important con- 
tributions are those of Toops, Wood, Monroe, West, Chapman, and 
Remmers. The first of these only will be reviewed here since the 
present experiments parallel rather closely the work of Professor 
Toops. 

The Present Investigation.—Two forms of 50 items each, hereafter
designated as Form A and Form B, were prepared in the nature of a 
general information test covering superficially general information 
in the field of history and the social sciences. 

Each item was stated in five different type-forms: Recall, 5- 
response, 3-response, 2-response, and true-false. By recall is meant 
the single blank completion type. The multiple-response forms 
were constructed with 5, 3, and 2 possible responses in order to vary 
the chance situation. The 2-response type and the true-false allow
comparison of relative merits with the chance element supposedly
equal since by pure guessing the probabilities are 50:50 between
success and failure. Comparison of the 2-response and true-false
types should therefore reveal whether the true-false method does 
possess undesirable characteristics arising from the suggestiveness 
of erroneous statements. Varying the number of responses from 2 to 
5 also allows testing of the adequacy of the conventional corrections 
for chance. 

A total of 562 subjects was used in the major experiment. These 
represented the entire senior classes of 24 Iowa high schools. 

In each school all the seniors were alphabetized and then divided 
into four numerically equal groups, designated as Groups A, B, C, and 
D. It will be seen that Group A included the first quarter of the alphabetical
list in all 24 schools combined. Groups B, C, and D were
formed from the second, third and fourth quarters of the alphabet.
These groups contained approximately 135 pupils each and are supposedly
approximately equal in ability, at least for all practical purposes.
This is shown by the mean recall scores below. The groups took the
tests as follows:

                                                                   MEAN OF
                                                                   RECALL
                                                                   SCORES
Group A, recall (A and B) followed by 5-response (A and B).......   23.9
Group B, recall (A and B) followed by 3-response (A and B).......   21.9
Group C, recall (A and B) followed by 2-response (A and B).......   23.1
Group D, recall (A and B) followed by true-false (A and B).......   22.7


The purposes in having all subjects take the recall tests first were: 
First, to get a check on the equality of talent of the four groups; 
secondly, to reduce the practice effects to a minimum; and third, in 
order that the practice effects should spread equally to the other four 
types so that the reliability coefficients might be fairly comparable. 

The recall tests were given on one day and the second test on the 
following day.

Samples of the tests follow: 


I. Recall

 1. The American Revolution began in the year ..........
11. The country in which the Boxer uprising took place was ..........
15. Wellington led the English in the Battle of ..........
45. The theory of geometric progression in population and arithmetic
    progression in production was stated by ..........
46. The first transcontinental railroad in the United States was the
    ..........

II. 5-response

 1. The American Revolution began in   1762   1775   1783   1789   1812
11. The country in which the Boxer uprising took place was   America
    Russia   Japan   China   South America
15. Wellington led the English in the Battle of   Crecy   Austerlitz
    Orleans   Jena   Waterloo
45. The theory of geometric progression in population and arithmetic
    progression in production was stated by   Jevons   Marshall
    J. S. Mill   Malthus   Adam Smith
46. The first transcontinental railroad in the United States was the
    Southern Pacific   Santa Fe   Northern Pacific   Union Pacific
    Great Northern

III. 3-response

 1. The American Revolution began in   1762   1775   1789
11. The country in which the Boxer uprising took place was   Japan
    China   Russia
15. Wellington led the English in the Battle of   Waterloo   Orleans
    Austerlitz
45. The theory of geometric progression in population and arithmetic
    progression in production was stated by   Adam Smith   Malthus
    J. S. Mill
46. The first transcontinental railroad in the United States was the
    Southern Pacific   Union Pacific   Northern Pacific

IV. 2-response

 1. The American Revolution began in   1762   1775
11. The country in which the Boxer uprising took place was   China
    Japan
15. Wellington led the English in the Battle of   Waterloo   Austerlitz
45. The theory of geometric progression in population and arithmetic
    progression in production was stated by   Adam Smith   Malthus
46. The first transcontinental railroad in the United States was the
    Union Pacific   Northern Pacific

V. True-false

 1. The American Revolution began in 1775.
11. The Boxer uprising took place in Japan.
15. Wellington led the English in the battle of Waterloo.
45. The theory of geometric progression in population and arithmetic
    progression in production was stated by Malthus.
46. The first transcontinental railroad was the Union Pacific.


The time needed for each student to complete each test (100 items) was recorded
to the nearest half-minute.

The Results.—Tables I to IV present the important facts brought
out. It is apparent from Table I that the reliabilities
decrease as the chance factor enters increasingly into the situation, the
3-response type behaving erratically, however, for reasons which are
not evident. The reliability coefficients are surprisingly high for
tests occupying but 5 to 10 minutes (see Table IV).

Table II shows that all of the recognition forms (multiple-response 
and true-false) are very much easier than the recall. The increases in 
mean scores follow the order of increased opportunity for guessing, 
except that true-false seems more difficult than the 2-response. This 
is in need of explanation since both are in reality 2-response tests, thus 
presenting about the same situation with respect to chance. 

The variabilities of Forms A and B are equal enough for practical 
purposes, and the changes in variability from type to type are not 
very great. 

Table III will furnish a working basis for rules relative to the 
lengths of test in terms of number of items which can be given per unit 
of time. Table IV, Item 4, gives a better statement of the same
facts. 

The marked differences present in the rate at which the various 
types of test items can be completed lead to some very important 
considerations. Taking true-false as an example, it was found that
the reliability of 100 true-false items can be estimated as .714 in 
comparison with .896 for 100 recall items. To answer 100 recall items 
requires 18.7 minutes while 100 true-false can be completed in 10.2 
minutes. This means that about 183 true-false items can be answered 
in the time needed for 100 recall items. What, then, would be the 
reliability of 183 true-false items? Item 5 of Table IV estimates 
this at .820, obtained as follows: 


$$ r_n = \frac{nr}{1 + (n - 1)r} = \frac{(1.83)(.714)}{1 + (1.83 - 1.00)(.714)} = .820 $$

The other values under item 5 of Table IV were obtained in the 
same way. Obviously, the constant time basis is the fairest way 
of viewing our problem. The differences considered in this way are 
not very great, the recall, 5-response, and 2-response types being 
about equal in reliability for tests on constant time bases, the 3- 
response and true-false turning out to be somewhat inferior. 
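
The constant-time comparison can be re-computed from the figures of Table IV. The following is a minimal sketch of Brown's Formula as used in the text; the function and variable names are illustrative only, and the lengthening factor n is taken as the number of items answerable in 18.7 minutes divided by 100.

```python
def spearman_brown(r: float, n: float) -> float:
    """Estimated reliability of a test lengthened n times (Brown's Formula)."""
    return n * r / (1.0 + (n - 1.0) * r)


# Table IV, Item 2: reliability of 100 items.
reliability_100 = {
    "recall": 0.896,
    "5-response": 0.886,
    "3-response": 0.748,
    "2-response": 0.849,
    "true-false": 0.714,
}

# Table IV, Item 4: items answerable in 18.7 minutes.
items_in_recall_time = {
    "recall": 100,
    "5-response": 117,
    "3-response": 139,
    "2-response": 164,
    "true-false": 183,
}

# Table IV, Item 5: reliability for a constant 18.7 minutes of testing.
constant_time = {
    kind: spearman_brown(reliability_100[kind], items_in_recall_time[kind] / 100.0)
    for kind in reliability_100
}

for kind, value in constant_time.items():
    print(f"{kind:<11} {value:.3f}")
```

The same function with n = 2 reproduces the step from the 50-item to the 100-item reliabilities of Table I, e.g. from .811 to about .896 for recall. The 3-response value may differ from the printed .806 by about .001, presumably through rounding of the item counts.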

Table V shows the correlations of the recall type with each of the 
others. The sums of Forms A and B are used, i.e., 100 items. These
correlations are all surprisingly low, ranging from .384 to .767. This 
fact would seem to indicate that there are important differences 
between the recall and recognition types in the kind of mental processes 
brought into play. It should always be remembered that we cannot 
be sure that re-statement of recall items in multiple-response or true-false
forms does not alter their relative difficulties. However, such a
lack of perfect agreement as our coefficients show probably cannot be 
accounted for by the matter of changed difficulties but must be in 
large part indicative of differences in the pedagogical and psychological 
characteristics of the several types. 

When the factor of unreliability is ruled out, at least theoretically, 
by the use of the principle of corrections for attenuation due to errors 
of measurement, the corrected values range from .480 to .861, coeffi- 
cients which still show that there is marked lack of identity of the five 
objective types which we have used. The correction formula is: 


$$ r_{\infty} = \frac{r_{xy}}{\sqrt{r_{x_1 x_2}\, r_{y_1 y_2}}}, \text{ where} $$

$r_{xy}$ = the correlation between the two types,
$r_{x_1 x_2}$ = the reliability of type 1,
$r_{y_1 y_2}$ = the reliability of type 2.
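
As a check on the corrected values, the correction for attenuation can be applied to the raw correlations of Table V, taking the 100-item reliabilities of Table I as the reliabilities of the two types. This is a sketch only; the function name and the pairing of rows with test types are assumptions drawn from the order of the tables.

```python
import math


def correct_for_attenuation(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Correlation between two measures after removing unreliability in both."""
    return r_xy / math.sqrt(rel_x * rel_y)


recall_reliability = 0.896  # Table I, 100 items

# (raw correlation with recall, 100-item reliability of the type)
types = {
    "5-response": (0.767, 0.886),
    "3-response": (0.618, 0.748),
    "2-response": (0.622, 0.849),
    "true-false": (0.384, 0.714),
}

corrected = {
    kind: correct_for_attenuation(r_raw, recall_reliability, rel)
    for kind, (r_raw, rel) in types.items()
}

for kind, value in corrected.items():
    print(f"recall vs. {kind:<11} {value:.3f}")
```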





TABLE I.—RELIABILITY: FORM A VS. FORM B, UNCORRECTED FOR CHANCE

                                     Reliability
                 Number      ---------------------------------
Type             of cases    50 items          100 items
                                               (Brown’s Formula)

Recall..........   562       .811 (PE .010)        .896
5-response......   137       .796 (PE .021)        .886
3-response......   134       .598 (PE .037)        .748
2-response......   135       .737 (PE .027)        .849
True-false......   133       .555 (PE .040)        .714














TABLE II.—MEANS AND SIGMAS, UNCORRECTED FOR CHANCE

                        Means             Standard deviations
Type              Form A     Form B       Form A     Form B

Recall..........  12.18      10.85         6.37       6.37
5-response......  27.20      22.80         7.61       7.68
3-response......  30.61      26.41         5.73       6.06
2-response......  35.64      31.98         5.73       5.88
True-false......  30.06      27.67         5.98       6.84
TABLE III.—TIME IN MINUTES TO DO 100 ITEMS

Percentiles   Recall   5-response   3-response   2-response   True-false

    90         25.7       20.3         18.6         15.5         14.3
    75         21.8       18.6         15.8         13.3         12.2
    50         18.7       16.3         13.7         11.8         10.5
    25         14.4       13.6         11.7         10.3          8.7
    10         12.3       11.8          9.8          9.1          8.1




















TABLE IV.—RELIABILITY COEFFICIENTS AND AVERAGE TIMES

                                        Recall  5-response  3-response  2-response  True-false

1. Reliability of 50 items, Form A vs.
   Form B.............................   .811      .796        .598        .737        .555
2. Reliability of 100 items (Brown’s
   Formula)...........................   .896      .886        .748        .849        .714
3. Average time in minutes to do 50
   items..............................   9.35      8.0         6.75        5.7         5.1
4. Number items per 18.7 minutes
   (required for 100 recall items)....   100       117         139         164         183
5. Reliability of 18.7 minutes testing
   (average recall time is 18.7
   minutes) (by Brown’s Formula)......   .896      .901        .806        .902        .820




















TABLE V.—CORRELATIONS BETWEEN THE RECALL AND THE FOUR OTHER TYPES,
RAW AND CORRECTED FOR ATTENUATION

Type              r (raw)    r (corrected for attenuation)     N

5-response......   .767               .861                    137
3-response......   .618               .755                    144
2-response......   .622               .713                    138
True-false......   .384               .480                    133





Comparisons with Toops’ Study.—Table VI presents the main facts
of a similar study by Professor H. A. Toops. Comparing the entries
in this table with those of Table IV, we note that our reliability
coefficients are higher in general. This, however, is probably a matter
of range of talent since Toops worked with college classes. The relative
sizes of the coefficients can be compared with more validity.
Three types of tests (viz., recall, 5-response, and true-false) were common
to both studies. For both studies the reliabilities ranked in
order, recall, 5-response, and true-false.


TABLE VI.—TOOPS’S DATA: COMPARISON OF THE RELIABILITY COEFFICIENTS OF
THE RECALL, RECOGNITION, AND TRUE-FALSE TESTS (ALL METHODS GROUPED
TOGETHER) WITH CERTAIN ADDITIONS

                                                                Recall  5-response  True-false

1. Reliability (r12) of halves, 124 cases. Two forms of 25 each   .448     .385        .340
2. Reliability of two 50-question sets (Brown’s Formula, n is 2)  .618     .556        .507
3. Reliability of 100 items (Brown’s Formula, n is 4)...........  .764     .715        .673
4. Average time in minutes to do 50 questions...................  6.9      5.6         3.6
5. Number of questions per unit of recall time..................  1.00     1.23        1.92
6. Reliability of Form A with Form B when 6.9 minutes
   examination time is used.....................................  .618     .607        .664
7. Reliability of Form A with Form B when 13.8 minutes time
   is used (i.e., time to do 100 recall items)..................  .764     .755        .798


The same caution applies to the average time entries. The present
investigation yielded higher mean times but this probably means
nothing more than would be accounted for by the fact that our subjects
were high school seniors and that our tests may have been more
difficult relative to the group tested. The most meaningful comparison
would be between Item 5 of Table VI and Item 4 of Table IV, thus:


Number of items per unit of 100 recall items:

                      RECALL    5-RESPONSE    TRUE-FALSE
Toops...............   100         123           192
Ruch-Stoddard.......   100         117           183


These figures show a remarkably close agreement between the 
two studies. 


Taking our present figures as a basis, the numbers of items of each
test type which can be answered per minute by the average grade
pupil are:


Recall, about 5 per minute (100 in 18.7 minutes) 

5-response, about 6 per minute (100 in 16.0 minutes) 
3-response, about 7 per minute (100 in 13.5 minutes) 
2-response, about 9 per minute (100 in 11.4 minutes) 
True-false, about 10 per minute (100 in 10.2 minutes) 


These recommendations may be useful to teachers in their present
form although the mean times are not as desirable a method of determining
time limits as would be some more nearly “power” basis,
e.g., the ninetieth percentile, or that time in which 90 per cent of the
pupils can complete all the items within their abilities. The following
recommendations are, therefore, based upon the values for the ninetieth
percentiles found in Table III.


Recall, about 4 per minute (100 in 25.7 minutes) 
5-response, about 5 per minute (100 in 20.3 minutes) 
3-response, about 5½ per minute (100 in 18.6 minutes)
2-response, about 6½ per minute (100 in 15.5 minutes)
True-false, about 7 per minute (100 in 14.3 minutes) 
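
The rates in both lists follow directly from the times for 100 items. A short sketch, with illustrative names, that recomputes them from the mean times quoted in the text and the ninetieth percentiles of Table III:

```python
def items_per_minute(minutes_per_100_items: float) -> float:
    """Average number of items answered per minute."""
    return 100.0 / minutes_per_100_items


# Mean times for 100 items, as quoted in the text.
mean_minutes = {
    "recall": 18.7,
    "5-response": 16.0,
    "3-response": 13.5,
    "2-response": 11.4,
    "true-false": 10.2,
}

# Ninetieth-percentile times for 100 items (Table III).
p90_minutes = {
    "recall": 25.7,
    "5-response": 20.3,
    "3-response": 18.6,
    "2-response": 15.5,
    "true-false": 14.3,
}

for kind in mean_minutes:
    at_mean = items_per_minute(mean_minutes[kind])
    at_p90 = items_per_minute(p90_minutes[kind])
    print(f"{kind:<11} mean basis {at_mean:4.1f}   90th percentile basis {at_p90:4.1f}")
```

The printed recommendations are these rates rounded to the nearest whole or half item.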


The rather marked differences in the speed of answering the 
different type-forms of the same items shows at once that comparing 
reliabilities of tests of equal lengths in terms of numbers of items is 
not the most valid criterion of relative reliabilities. If 180-190 true- 
false items can be completed in the time required for 100 recall items, 
valid comparisons should be based upon total working time limits. This 
has been done by the use of Brown’s Formula (see Item 5 in Table 
IV and Item 7 in Table VI). Selecting the identical test-types from 
both studies we have: 


RELIABILITY FOR CONSTANT TIME UNITS (TIME TO COMPLETE 100 RECALL ITEMS)

                      RECALL    5-RESPONSE    TRUE-FALSE
Toops...............   .764        .755          .798
Ruch-Stoddard.......   .896        .901          .820


The two studies agree upon the fact that the recall and 5-response 
types are equally good (within the probable errors) when constant 
times are used. The only disagreement centers about the true-false 
type which Toops found to be slightly the best of all while our results 
indicate fairly definite evidence of lesser reliability. More will be 
said upon the subject of the peculiar characteristics of the true-false 
examination under the discussion of Part II.


SUMMARY OF Part I 


1. The recognition types (multiple-response and true-false) are 
less reliable than the recall form of the same items for a constant num- 
ber of items. 

2. The recognition types are markedly easier than the recall form
of the same items (a part of this difference may be a practice effect). 

3. The lessened reliability of the multiple-response types is in part 
at least due to the chance factors involved in successful responses. 
4. Percentile time rates of completion are given in Table III as
a basis for recommendations as to the proper length of objective exam- 
inations of the different types. 

5. The numbers of items in recognition form which can be given 
in the time needed for completion of 100 recall items (18.7 minutes) are 117,
139, 164, and 183, for 5-response, 3-response, 2-response, and true- 
false respectively. 

6. When reliabilities are compared for equal working times (Table 
IV, Item 5), recall, 5-response, and 2-response
are practically on a par, with 3-response and true-false somewhat 
inferior. 

7. All the objective forms are highly reliable in comparison with the 
traditional examinations requiring the same working time (no data are 
given here, but a sampling of several hundred final examinations 
yielded reliability coefficients from 0.35 to 0.90 for 30 to 60 minute
examinations, the average being about 0.55). 

8. With minor exceptions, the present study is in close agreement 
with a former study of Professor H. A. Toops (Tables IV and VI). 

9. The correlations between recall type and the four recognition 
forms are surprisingly low, .384-.767, even when corrected for attenua- 
tion, .480-.861 (Table V). 


Part II 


The Validity of the Conventional Corrections for Chance.—The prac- 
tice has been rather generally adopted of applying corrections to test 
scores of the multiple-choice type to minimize the chance elements 
involved. This device may be symbolized as follows: 


$\text{Corrected Score} = R - \dfrac{W}{n - 1}$, where n = the number of responses


from which the choice is made. Thus, for true-false tests, this formula 
becomes, simply, R — W. 
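
A minimal sketch of the conventional correction, taking R as the number of right answers and W as the number of wrong answers; the function name and the example pupil are hypothetical.

```python
def corrected_score(right: int, wrong: int, n_responses: int) -> float:
    """Score corrected for chance: R - W / (n - 1).

    n_responses is the number of alternatives offered per item
    (2 for true-false and 2-response tests).
    """
    if n_responses < 2:
        raise ValueError("a recognition item needs at least two responses")
    return right - wrong / (n_responses - 1)


# Hypothetical pupil: 30 right and 20 wrong on a 50-item form.
print(corrected_score(30, 20, 2))  # true-false or 2-response: R - W
print(corrected_score(30, 20, 3))  # 3-response: R - W/2
print(corrected_score(30, 20, 5))  # 5-response: R - W/4
```

Omitted items count neither as right nor as wrong, so they leave the corrected score unchanged.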

This correction procedure has been challenged a number of times 
on logical and experimental grounds. It appears to imply that test 
subjects resort to guessing when in doubt, and further guess wrong 
just as often as they guess right. Be that as it may, most test workers 
would probably agree to rest the case upon the issue: Are the test 
reliabilities increased or decreased by the use of the correction formula? 
This would seem to be the important consideration. Speculation 
about the ‘‘guessing”’ element in testing is an almost profitless task. 
The problem should be attacked directly by experimentation. This 
the present writers hope to do very soon. For the present, the
case can rest on certain analyses of the internal consistency of our data. 
All of the recognition and true-false test scores of the present 
investigation were corrected for chance and then re-correlated for
reliability, Form A vs. Form B. Table VII shows the result. Com- 
parison with the coefficients for uncorrected scores of Table I shows 
that reliabilities were decreased in three out of five cases when chance cor- 
rections were applied, the greatest loss being in the true-false type. 


TABLE VII.—RELIABILITY: FORM A VS. FORM B, CORRECTED FOR CHANCE

                 Number      Reliability       Reliability 100
Type             of cases    50 items          items (Brown’s
                                               Formula)

Recall..........   562       ....                  ....
5-response......   137       .775 (PE .023)        .873
3-response......   134       .675 (PE .031)        .806
2-response......   135       .682 (PE .031)        .811
True-false......   133       .407 (PE .049)        .578











To test this situation further, a subsidiary study was made using 
the seven tests of the Terman Group Test of Mental Ability which 
involve chance answers. These reliability coefficients, corrected and 
uncorrected for chance, are given in Table VIII. In the seven tests, 
the reliabilities decreased five times and increased in two cases, the 
average loss being about .04. Due to the small number of cases (43) 
in this table, the losses are not very significant. However, viewing 
the situation broadly, there is no reason to believe that chance correc- 
tions result in any increased accuracy of measurement. If anything, 
they disturb the reliability. 


TABLE VIII.—TERMAN GROUP TESTS—RELIABILITY: FORM A VS. FORM B

                                          Reliability       Reliability
Name                       Type           uncorrected       corrected for chance

1. Information...........  4-response     .491 (PE .078)    .447 (PE .082)
2. Best answer...........  3-response     .403 (PE .086)    .384 (PE .088)
3. Word meaning..........  2-response     .673 (PE .056)    .560 (PE .071)
6. Sentence meaning......  Yes-no         .532 (PE .073)    .474 (PE .080)
7. Analogies.............  4-response     .533 (PE .074)    .547 (PE .072)
8. Mixed sentences.......  True-false     .678 (PE .056)    .556 (PE .071)
9. Classification........  5-response     .346 (PE .091)    .412 (PE .086)

Mean of r’s:                              .522              .483


N = 43. 
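
The counts and the average loss quoted for the Terman subsidiary study can be re-derived from the coefficients of Table VIII. The sketch below keys each test by its number; the variable names are illustrative.

```python
# test number -> (reliability uncorrected, reliability corrected for chance)
table_viii = {
    1: (0.491, 0.447),
    2: (0.403, 0.384),
    3: (0.673, 0.560),
    6: (0.532, 0.474),
    7: (0.533, 0.547),
    8: (0.678, 0.556),
    9: (0.346, 0.412),
}

decreased = sum(1 for unc, cor in table_viii.values() if cor < unc)
increased = sum(1 for unc, cor in table_viii.values() if cor > unc)

mean_uncorrected = sum(unc for unc, _ in table_viii.values()) / len(table_viii)
mean_corrected = sum(cor for _, cor in table_viii.values()) / len(table_viii)
mean_loss = mean_uncorrected - mean_corrected

print(f"decreased in {decreased} tests, increased in {increased}")
print(f"mean uncorrected {mean_uncorrected:.3f}, mean corrected {mean_corrected:.3f}")
print(f"average loss {mean_loss:.3f}")
```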
Reference to Tables IX and X shows the changes in mean scores
and variabilities introduced by the use of corrections for chance for 
the main investigation and for the Terman Group Test. Table IX 
should be compared with Table II of Part I in this connection. The 
correction lowers the mean scores markedly but increases the variability 
considerably. These facts are quite in line with the expectancy. The 
same summary holds for the Terman Tests of Table X. 


TABLE IX.—MEANS AND SIGMAS, CORRECTED FOR CHANCE

                        Means             Standard deviations
Type              Form A     Form B       Form A     Form B

Recall..........   ....       ....         ....       ....
5-response......  22.80      18.02         8.63       7.65
3-response......  23.05      17.51         8.32       7.97
2-response......  23.21      16.94         9.39       9.02
True-false......  14.09      10.76         7.30       8.62

TABLE X.—TERMAN GROUP TESTS—MEANS AND SIGMAS

                                Means                        Standard deviations
                      Uncorrected      Corrected          Uncorrected      Corrected
Type                 Form A  Form B   Form A  Form B     Form A  Form B   Form A  Form B

1. (4-response)....  17.77   16.26    17.53   16.00       2.18    2.71     2.38    2.93
2. (3-response)....   8.95    9.63     8.60    9.37       1.71    1.53     2.04    1.77
3. (2-response)....  25.42   26.81    24.56   26.37       3.47    2.67     4.02    3.33
6. (yes-no)........  17.35   19.77    16.30   18.77       3.26    3.25     3.56    3.77
7. (4-response)....  12.72   12.30    12.58   12.16       2.89    2.50     2.98    2.67
8. (True-false)....  13.58   15.09    12.23   13.86       2.54    2.72     3.16    3.34
9. (5-response)....  15.56   15.23    15.26   15.09       1.59    2.18     1.84    2.42


























The practical import of these changes in mean scores is hard to 
judge. It may, perhaps, compensate for the relative ease of recogni- 
tion forms in comparison with the recall, but this may not be so very 
important because ease or difficulty can be controlled by the selection 
and formulation of the test statements. 

The increased variability of the corrected scores would be a real
gain in case it accompanied increased reliability which does not
appear to be the case. The true variability of a distribution of test
scores can be estimated from the formula

$$ \sigma_{\text{true}} = \sigma_{\text{obtained}} \sqrt{r_{12}}, $$

where $r_{12}$ is the reliability of the test scores which constitute the
distribution. All other things being equal, the sigmas will be larger 
for the distribution of the less reliable scores (here the corrected scores, 
in general). 
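
To illustrate, the sketch below applies this estimate to the true-false scores of Form A, before and after the chance correction. It assumes that the formula reads true sigma = obtained sigma times the square root of the reliability, which is the reading suggested by the definition of the reliability term; the function name is hypothetical, and the sigmas and reliabilities are those reported in the tables for uncorrected and corrected scores.

```python
import math


def true_sigma(sigma_obtained: float, reliability: float) -> float:
    """Estimated true standard deviation: obtained sigma * sqrt(reliability)."""
    return sigma_obtained * math.sqrt(reliability)


# True-false, Form A: (obtained sigma, 50-item reliability)
uncorrected = (5.98, 0.555)
corrected = (7.30, 0.407)

print(f"uncorrected: obtained {uncorrected[0]:.2f}, true {true_sigma(*uncorrected):.2f}")
print(f"corrected:   obtained {corrected[0]:.2f}, true {true_sigma(*corrected):.2f}")
```

On these figures the obtained sigma rises by more than a point under correction, while the estimated true sigma changes far less, which is the sense in which the added variability is not a real gain.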

Further discussion of such points should be deferred until the 
construction of Table XI has been made clear. 

An Analysis of the Validity of Corrections for Chance.—Table XI 
has been prepared to show the results of the main experiment when 
analyzed according to certain a priori assumptions about the rôle of
chance and guessing in the test scores. 

Column A gives the actual mean scores for the sums of Forms A 
and B uncorrected for chance. 


Column B gives the same for the scores after the $R - \frac{W}{n - 1}$
correction formula has been applied. Recall is considered as involving
no appreciable chance influence. It is to be noted again that the mean
scores do decrease but that they do not decrease enough even to approximate
the mean recall scores, except in the true-false form. The behavior
of the true-false tests is perplexing when we remember that it is really 
a two-response test in effect, yet the chance correction does not bring 
the regular true-false and two-response test means at all closely 
together. Also, we note that the 5-response, 3-response, and 
2-response corrected means are equal. 
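A minimal sketch of the correction for chance may make the arithmetic concrete. It assumes the usual form of the formula, score = R - W/(n - 1), where R is the number right, W the number wrong, and n the number of responses offered; the function name and the sample counts are hypothetical.

```python
def corrected_score(rights: int, wrongs: int, n_responses: int) -> float:
    """Apply the chance correction: score = R - W / (n - 1)."""
    if n_responses < 2:
        raise ValueError("a recognition item needs at least two responses")
    return rights - wrongs / (n_responses - 1)


# Hypothetical pupil: 15 right and 5 wrong on each type of test.
two_response = corrected_score(15, 5, 2)   # each wrong answer costs 1 point
five_response = corrected_score(15, 5, 5)  # each wrong answer costs 1/4 point
```

The penalty per wrong answer shrinks as n grows, which is why the same raw record is reduced most in the two-response and true-false forms.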

This may have a bearing on the controversy over the merits of the 
true-false examination which has been attacked as artificial and peda- 
gogically and psychologically unsound. 

Column C shows similar facts. 

The values in Column D are computed upon the assumption that 
we can accept a score of 23.0 points (the mean recall score) as meaning 
that the subjects did possess certain knowledge of at least 23.0 items 
in the total 100. Some allowance should be made for practice effects 
were it possible to determine the amount. Therefore, 77 items were 
not certainly known. 

Column E merely divides 77 by n (the number of responses among
which the answer is to be selected, n being 5, 3, 2, and 2, respectively
for the four tests involving chance). If chance alone is concerned,
i.e., if pure guessing is resorted to, we should expect the numbers of
Column E to represent the average additional score earned by guessing
over the 23.0 items known certainly.

Five Types of Objective Examinations

[TABLE XI. ANALYSIS OF VALIDITY OF CHANCE CORRECTIONS. The body of the table is illegible in this copy. Its columns are: A, actual means of uncorrected scores; B, actual means of corrected scores; C; D, number of items not known on the basis of 23 known (77.0); E, theoretical number "guessed" right; F, actual number of right "guesses" (A minus 23.0); G, excess of actual "guesses" over theoretical (F minus E). Table footnote: "Chance element does not enter."]

The Journal of Educational Psychology

Column F shows the actual surplus of mean scores over the certain
knowledge represented by recall (i.e., 23.0 items). In other words,
these are the values in Column A minus 23.0 points.

Column G gives the difference of Columns F and E, in other
words the excess of actual right "guesses" over the theoretical right
"guesses." The results are interesting. In the 5-, 3-, and 2-response
tests, the subjects actually guessed right much oftener than chance
allows (upon the basis of our original assumption). But in the case of
the true-false, there seems to be a tendency to guess wrong oftener
than right.
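The construction of Columns D to G can be sketched as follows. The 100 items and the mean recall score of 23.0 come from the text; the uncorrected mean of 50.0 used in the example is hypothetical, as are the function and key names.

```python
TOTAL_ITEMS = 100    # items in the test, from the text
KNOWN_ITEMS = 23.0   # mean recall score, taken as items certainly known


def chance_analysis(mean_uncorrected: float, n_responses: int) -> dict:
    """Compute Columns D, E, F and G for one type of test."""
    not_known = TOTAL_ITEMS - KNOWN_ITEMS            # Column D
    expected_guessed = not_known / n_responses       # Column E
    actual_guessed = mean_uncorrected - KNOWN_ITEMS  # Column F (A - 23.0)
    excess = actual_guessed - expected_guessed       # Column G (F - E)
    return {"D": not_known, "E": expected_guessed,
            "F": actual_guessed, "G": excess}


# Hypothetical 5-response test with an uncorrected mean (Column A) of 50.0.
row = chance_analysis(50.0, 5)
```

A positive value in Column G means the subjects scored more than pure guessing on the unknown items would allow; a negative value means they scored less.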

If this is confirmed by subsequent studies, there is some basis to 
the argument that true-false statements are confusing and unlike 
the straight recognition or recall types in their psychological demands. 
This issue is complicated, however, by the fact that the chance correc- 
tion seems to be most adequate in the true-false type. The correction 
formula over-corrects for true-false but under-corrects badly for the 
recognition forms. 

The correction operates least well for the 5-response type but
becomes increasingly better as the number of possible responses
is reduced. This probably arises from the fact that in 5-response
tests, one, two, or more responses are so much "dead timber" which
is immediately rejected and hence does not enter into the real chance
situation.

The result is, on the one hand, a greater opportunity for successful
guessing than n in the formula suggests and, on the other hand, a resulting
under-correction, since the formula assumes all responses equally
likely under the actual conditions of the test.


SUMMARY OF Part II 


1. The reliability of test scores when corrected for chance by
the formula, score = $R - \frac{W}{n-1}$, was decreased in three out of four
cases, 3-response being the exception (Tables I and VII).

2. A subsidiary experiment with seven tests from the Terman 
Group Test of Mental Ability showed decreased reliability after correc- 
tion in five of the seven tests (Table VIII). 

3. The variability of the score distributions was increased by 
correcting for chance (Table IX). 
4. The mean scores were lowered by the chance corrections (Table
IX).

5. The chance correction formula works best where n is small, 
i.e., in the 2-, or 3-response tests (Table X). 

6. The tentative suggestion was made that the practice of correct- 
ing for chance be abandoned. 

7. Subjects appear to "guess" right oftener than chance will
account for in the 5-, 3-, and 2-response tests, but to "guess" wrong
oftener than right in the true-false type (Table X).

8. There is some evidence that the true-false type possesses 
psychological and educational characteristics which are different from 
those of the other types. This needs further study. 
THE VALIDITY OF SELF-ESTIMATE¹
EUGENE SHEN 


Stanford University 


It is always an interesting question as to whether an individual
can know himself better than he knows his associates. Reporting on
the results of ratings of scientific men by themselves and their colleagues,
Cattell says: "It thus appears that there is no constant error
in judging ourselves; we are about as likely to overestimate as to
underestimate ourselves, and we can judge ourselves slightly more
accurately than we are likely to be judged by one of our colleagues."²
Cogan, Conklin, and Hollingworth in an investigation on the rating
of college women by themselves and one another find, on the other
hand, that "the individual does not judge herself as accurately as she
is judged by her friends," and that she tends to overestimate herself
in the desirable and underestimate herself in the undesirable traits.³

In a study on personal ratings, 28 individuals were requested to 
rank themselves and one another with respect to eight different traits. 
The traits used were: Intellectual quickness, intellectual profoundness, 
memory, impulsiveness, adaptability, persistence, leadership, and 
scholarship. For reasons which do not concern us here, the 28 men 
were divided into four groups, each rating a combination of only four 
of the eight traits. And because two persons failed to turn in their 
ratings, our final data consisted of 13 series of ranks in each of the 
eight traits. These ranks were then converted into scores in terms of 
standard deviations of a unit normal distribution, and ratings by 
judges of the same group on the same trait were averaged. The 
reliability for the average ratings ranged from .62 for impulsiveness 
to .91 for scholarship, all except impulsiveness having a reliability well 
above .80. 

If we take the average rating of an individual by the group (self
included) as a criterion and compare it with his self-estimate, we can
derive, for the self-estimates on each trait, three measures of errors,
a total error (TE), a systematic error (SE), and a chance error (CE),
as follows:

$$TE = \sqrt{\frac{\sum (x_1 - x_0)^2}{N}}$$

$$SE = \frac{\sum (x_1 - x_0)}{N}$$

$$CE = \sqrt{\frac{\sum (x_1 - x_0 - SE)^2}{N}} = \sqrt{(TE)^2 - (SE)^2}$$

where $x_1$ denotes the self-estimates and $x_0$ denotes the average ratings.
The total error is then the standard deviation of the self-estimates
from the average ratings; the systematic error is the average tendency
of over- or under-estimation; the chance error is the standard deviation
of the self-estimates from the average ratings after the systematic
error is corrected.

1 Slightly modified from parts of a thesis written under the direction of Prof.
Truman L. Kelley, to whom grateful acknowledgment is due for helpful suggestions
and constant encouragement.

2 "American Men of Science."

3 School and Society, 11: 171-179, 1915.
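The three measures can be sketched in a few lines. The sketch assumes the definitions just given (deviations of self-estimates from average ratings, summed over N cases); the function name and the four sample ratings are hypothetical.

```python
import math


def self_estimate_errors(self_scores, group_scores):
    """Return (TE, SE, CE) for paired self-estimates and average ratings."""
    n = len(self_scores)
    diffs = [s - g for s, g in zip(self_scores, group_scores)]
    se = sum(diffs) / n                                    # systematic error
    te = math.sqrt(sum(d * d for d in diffs) / n)          # total error
    ce = math.sqrt(sum((d - se) ** 2 for d in diffs) / n)  # chance error
    return te, se, ce


# Hypothetical standard scores for four cases.
te, se, ce = self_estimate_errors([1.0, 0.5, -0.2, 0.8],
                                  [0.6, 0.1, -0.4, 0.2])
```

Computing CE directly from the deviations and checking it against the square root of TE squared minus SE squared confirms that the two forms of the chance error agree.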

For comparison, the average error of the ratings of associates is 
calculated and converted into an estimated measure of the standard 
error, assuming a normal distribution of the errors, by multiplying 
the former by 1.25. Since all the individual series of ratings have a 
mean at zero, the systematic error of the ratings of associates is equal 
in magnitude to one twenty-seventh of that of the self-estimates and 
in the opposite direction. It is altogether negligible, and the esti- 
mated standard error of the ratings of associates is therefore practically 
a measure of the chance error as well as the total error. 
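The factor 1.25 can be checked with a short sketch. For a normal distribution of errors the standard deviation equals the average (absolute) error multiplied by the square root of pi over 2, which is about 1.2533; the function name is hypothetical.

```python
import math

# Exact ratio of standard deviation to average absolute error
# for a normal distribution.
EXACT_FACTOR = math.sqrt(math.pi / 2)


def estimated_standard_error(average_error: float) -> float:
    """Convert an average error to a standard error with the rounded factor 1.25."""
    return 1.25 * average_error


example = estimated_standard_error(0.40)
```

An average error of .40 thus converts to an estimated standard error of .50.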

TABLE I

Trait | Rating associates: average error | Rating associates: estimated standard error | Self-estimate: TE | Self-estimate: SE | Self-estimate: CE
Intellectual quickness | .40 | .50 | .67 | +.20 | .64
Intellectual profoundness | .42 | .53 | .61 | −.12 | .60
Memory | … | … | … | −.09 | .64
Impulsiveness | .59 | .74 | .67 | +.29 | .60
Adaptability | .57 | … | .91 | +.31 | .86
Persistence | .55 | .69 | 1.00 | +.67 | .74
Leadership | … | … | … | … | …
Scholarship | .39 | .49 | .47 | +.15 | .45
Average | … | … | .72 | +.23 (±.25) | .65

(… marks an entry that is illegible in this copy.)

A comparison of the second and third columns of Table I shows
that we tend to rank ourselves in a group less accurately than we
rank our friends on the average. Column 4 shows that we tend to
overestimate ourselves in most traits and to underestimate ourselves
in a few. However, a general interpretation of the constant tendency
in terms of the desirability or undesirability of the trait is difficult,
since the traits underestimated, intellectual profoundness and memory,
seem no less desirable than the other qualities, among which impulsiveness,
insofar as it correlates negatively with the others, would be
rather undesirable. One plausible explanation is to attribute the overestimation
of impulsiveness and the underestimation of intellectual
profoundness and memory to a defense mechanism. We say "I have
a very poor memory" or "I am rather impulsive" as an excuse for
our being not so extraordinarily good in leadership or in scholarship.
At any rate, the systematic errors are rather small, so that after their
elimination, the chance errors are, in most cases, still larger than those
for the rating of associates.

We can go further and inquire whether the tendency to overesti- 
mate or underestimate oneself is more or less consistent with the same 
individual. The individuals, each rating four traits, may be divided 
into five groups according to the number of traits in which they over- 
estimate themselves. The data give the following figures: 


Of 26 individuals, 9, 7, 4, 5, and 1 overestimate themselves in
4, 3, 2, 1, and 0 of four traits, respectively.


It thus seems that the constant tendency of self-estimate depends 
more upon the individual than upon the trait. There is no trait on 
which the individuals all overestimate or all underestimate themselves. 
But there are individuals who overestimate themselves on all the 
traits as well as one who underestimates himself on all the traits. If 
we consider individuals instead of traits, therefore, a larger systematic 
error and a smaller chance error are expected. Accordingly we 
make a new calculation of the errors with reference to individuals 
instead of traits. The results are presented in Table II. 

Our expectations are entirely fulfilled. Of the 26 individuals, six 
have negative systematic errors while the remaining 20 have 
positive systematic errors. The average value in each case is .41. 
This reduces the average total error of .66 to an average chance error of
.40, while the standard error of the ratings of associates is .59.

TABLE II

Judge | Rating others: average error | Rating others: estimated standard error | Self-estimate: total error | Self-estimate: systematic error | Self-estimate: chance error
A | .46 | .57 | 1.33 | −1.31 | .22
B | .41 | .52 | .21 | +.10 | .19
C | .52 | .64 | .22 | +.07 | .21
D | .48 | .60 | .45 | +.43 | .12
E | .45 | .56 | .48 | −.26 | .40
F | .55 | .68 | .52 | +.06 | .52
G | .45 | .56 | .61 | +.11 | .60
H | .51 | .64 | .85 | +.85 | .00
I | .56 | .70 | .57 | +.54 | .19
J | .55 | .69 | .65 | +.62 | .19
K | .57 | .71 | 1.28 | +1.16 | .53
L | .54 | .67 | 1.19 | −.16 | 1.18
M | .43 | .54 | .90 | −.50 | .74
N | .40 | .51 | .97 | +.81 | .54
O | .48 | .59 | .74 | +.72 | .16
P | .42 | .52 | 1.32 | +1.23 | .47
Q | .51 | .64 | .74 | +.15 | .72
R | .33 | .41 | .36 | +.03 | .36
S | .34 | .43 | .46 | .00 | .46
T | .41 | .52 | .20 | +.10 | .18
U | .46 | .57 | .37 | +.11 | .36
V | .47 | .58 | .65 | +.22 | .61
W | .45 | .57 | .52 | −.21 | .47
X | .43 | .53 | .74 | +.52 | .53
Y | .52 | .64 | .43 | +.26 | .34
Z | .50 | .62 | .29 | +.24 | .17
Average | .47 | .59 | .66 | +.22 (±.41) | .40

Thus the apparent inaccuracy of self-estimate is largely due to a
systematic error of the individual, a systematic tendency to over-
or under-estimate himself in all the traits according to the kind of delusion
that he has about himself. Although, therefore, an individual
is likely to rank himself in a group less accurately than his associates do,
he really knows himself well in that he knows his relative strength in
the various qualities rather accurately.














A NOTE ON THE ORGANIZATION OF DRILL WORK¹


F. B. KNIGHT 
State University of Iowa 


The purpose of this note is to point out certain aspects of the psy- 
chology of learning as applied to drill work in Grade VI arithmetic. 
Drill work has many purposes. The only one of the purposes that we 
are concerned with here is that of keeping a skill strong and workable 
after pupils have gone to the trouble of building up that skill. Divi- 
sion of fractions and small mixed numbers is used as a typical skill. 
It is customary to build up the skill of dividing by fractions and small 
mixed numbers in Grade V. Therefore the duty of the sixth grade 
relative to division by fractions is not to build the skill but to maintain 
it. Standard textbooks of Grade VI arithmetic accept this responsi- 
bility. Drill upon the division of fractions is included in these books. 
But the practices or means by which this skill is maintained in the 
several books reported below vary so widely that there is evidence of 
a serious difference of opinion as to how pupils keep skills once they are 
built. It is fair to say that the books we shall use in this note either
have been built on no conscious psychology of learning whatever
or their authors disagree upon certain principles of habit formation
that are commonly assumed to be of practical as well as theoretical
importance.


I. DISTRIBUTED PRACTICE VS. BUNCHED PRACTICE


One principle of maintaining as well as building a skill is that, with 
a given amount of practice available, it is better to spread the practice 
over the total learning period. The following chart will show the 
various degrees of respect paid to this principle of distributed practice. 

Five reputable textbooks of Grade VI arithmetic are used for
illustration. The progress of a class through each textbook for the 
school year was estimated by eight competent and experienced teachers 
of arithmetic. Of course all classes do not go at the same rate and no 
one class may go at exactly the rate used here. Each textbook was 
divided into 36-week sections by each one of the eight judges estimat- 
ing the natural progress of a class through each book separately. The 
eight judgments in every case were then averaged. This average was 
used as the 36 weekly divisions of each text. In the chart below every
week represents the amount of work that a class would do if the total
book is finished during the year. Since the amount of material in
the texts studied varies, classes using one text might have to go faster
per hour, or spend more hours, or omit more material than a similar
class using another of the texts. These considerations, however, are
not vital to the consideration of distributed vs. bunched drill.

1 Acknowledgment for analysis of the five texts is made to Miss Florence Scott
and Miss Amelia Blankenship.

TABLE SHOWING THE DISTRIBUTION OF EXAMPLES IN THE DIVISION OF FRACTIONS
During a 36-week School Year for Five Grade VI Textbooks

Number of Examples

Week | Text A | Text B | Text C | Text D | Text E¹
1 | 0 | 0 | 0 | 0 | 1
2 | 0 | 0 | 16 | 0 | 2
3 | 64 | 6 | 0 | 0 | 3
4 | 0 | 38 | 40 | 0 | 1
5 | 0 | 31 | 0 | 0 | 3
6 | 0 | 0 | 0 | 0 | 3
7 | 0 | 0 | 0 | 0 | 3
8 | 0 | 0 | 0 | 0 | 4
9 | 0 | 0 | 0 | 0 | 3
10 | 0 | 0 | 0 | 0 | 2
11 | 0 | 0 | 0 | 0 | 3
12 | 1 | 0 | 0 | 0 | 5
13 | 0 | 0 | 0 | 17 | 1
14 | 0 | 0 | 0 | 15 | 3
15 | 0 | 0 | 2 | 0 | 5
16 | 0 | 0 | 0 | 2 | 1
17 | 0 | 0 | 0 | 1 | 3
18 | 0 | 0 | 0 | [illegible] | 3
19 | 0 | 0 | 0 | 0 | 2
20 | 0 | 0 | 0 | 14 | 2
21 | 33 | 0 | 0 | 3 | 4
22 | 41 | 0 | 0 | 2 | 1
23 | 14 | 0 | 0 | 0 | 2
24 | 0 | 0 | 0 | 0 | 3
25 | 0 | 0 | 0 | 0 | 2
26 | 0 | 0 | 2 | 0 | 3
27 | 0 | 0 | 4 | 0 | 2
28 | 0 | 0 | 0 | 1 | 1
29 | 0 | 0 | 0 | 4 | 3
30 | 0 | 0 | 13 | 0 | 3
31 | 0 | 0 | 0 | 0 | 2
32 | 0 | 36 | 0 | 1 | 4
33 | 0 | 0 | 0 | 4 | 3
34 | 8 | 0 | 0 | 0 | 1
35 | 7 | 0 | 9 | 11 | 2
36 | 7 | 0 | 1 | 0 | 1
Total | 168 | 112 | 84 | 85 | 88

1 Certain examples in the division of fractions included in Text E are omitted from this count
because they are incorporated in a series of inventory units which are not paralleled by similar
units in the other texts.
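The contrast between bunched and distributed drill can be put in figures with a small sketch. The weekly counts for Text A and Text E are copied from the table as it appears in this copy; the two measures used (number of weeks with any practice, and the longest run of weeks without practice) and the function names are assumptions chosen for illustration, not the author's method.

```python
# Weekly numbers of examples, weeks 1 to 36, as printed in the table.
TEXT_A = [0, 0, 64, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
          0, 0, 33, 41, 14, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 7, 7]
TEXT_E = [1, 2, 3, 1, 3, 3, 3, 4, 3, 2, 3, 5, 1, 3, 5, 1, 3, 3,
          2, 2, 4, 1, 2, 3, 2, 3, 2, 1, 3, 3, 2, 4, 3, 1, 2, 1]


def weeks_with_practice(weekly_counts):
    """Count the weeks in which at least one example is given."""
    return sum(1 for count in weekly_counts if count > 0)


def longest_gap(weekly_counts):
    """Return the longest run of consecutive weeks with no examples."""
    longest = current = 0
    for count in weekly_counts:
        current = current + 1 if count == 0 else 0
        longest = max(longest, current)
    return longest


summary = {
    "Text A": (weeks_with_practice(TEXT_A), longest_gap(TEXT_A)),
    "Text E": (weeks_with_practice(TEXT_E), longest_gap(TEXT_E)),
}
```

On these figures Text A gives practice in 8 of the 36 weeks with a longest gap of 10 weeks, while Text E gives some practice every week.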


COMMENTS ON THE DISTRIBUTION OF PRACTICE REPORTED ABOVE 


1. If the organization of drill in Texts A, B, C, and D is correct,
then the organization of drill work in Text E is wrong, and the reverse.
While no important significance should be attached to minor differences
in drill organization, it is obvious that the theories underlying
the construction of the five texts above are radically different. Texts
A, B, C, and D practice the theory that huge amounts of drill given
infrequently are an economical method of maintaining a skill. Text E is
built on the theory that a little drill weekly is an economical method
to use in maintaining skills. It is hardly possible that both theories
are equally sound.

2. It might be urged that the Texts A, B, C, and D assume the 
use of drill pads. However, while there are available at the present 
time various drill services for whole numbers there is no widely used 
drill service for fractions. 

3. There is a difference of opinion as to the amount of drill that is
necessary. Text A uses the same theory of infrequency of practice
as does Text C. Yet Text A gives exactly twice as much practice.
Here again both texts cannot be right. If 168 examples in the division
of fractions is the correct amount of practice for a Grade VI text, then
Text A has made adequate provision but Text C fails to give enough
practice. However, if about 90 practices during Grade VI is an
adequate provision, then Text A provides wasteful over-practice while
Text C provides sufficient drill.

The difference in the amount of drill between Text A and Text E 
may be explained by the difference in the distribution. It is possible 
that Text E with its distributed practice will give as much or more skill 
with 88 practices as one will get from the 168 bunched practices of 
Text A. 

It is also possible that the authors of the texts desire different
amounts of skill. If this is so, we have a defense for the varying
amounts of practice. The fact is that Text E alone gives any usable
information about standards of attainment in the division of fractions.
It is doubtful if the problem of attainment in measurable terms was
dealt with seriously by the other texts.


II. PRACTICE ON ALL THE UNITS OF A SKILL VS. PRACTICE ON A FRAGMENT
OF THE SKILL


It has been pointed out before that in building drill material it is
not at all unusual to find a systematic review or drill upon some function
which does not include all the units of that function. Thus, it is
possible to find in standard texts in a periodic review of the multiplication
of whole numbers the combination 6 × 6 practiced 10 times but
the combination 9 × 7 not practiced at all. The theory underlying
this loose writing of drill material is that practice upon a function or
parts of it in general will strengthen the total function. No doubt
there is some transfer. But there are no experimental data showing
that the transfer between isolated fact units, such as 6 × 6 and 9 × 7,
is so great that it is not worth while to take the pains needed to so
build drill material that every combination is practiced with calculated
frequency.

In order to insure that every unit of a skill is practiced, it is advisable
to have at hand an analysis of that skill in terms of the unit skills
which in combination make the total skill. Such an analysis for
whole numbers is comparatively simple. It is not at the present time
known whether we have in existence an adequate analysis of the division
of fractions and small mixed numbers in terms of the unit skills
involved in that complex function. The following analysis, doubtless
open to improvement through expansion or condensation, was used in
checking the frequency of practice on each item of the total skill of
dividing fractions in the five textbooks.


ANALYSIS OF DIVISION OF FRACTIONS IN TERMS OF THE LEARNING
PROCESS¹

[The number at the right of each entry is the unit skill number. The printed numerical examples of fractions are illegible in this copy and are omitted; only the legible examples are retained.]

I. As to the Form of Stating the Example
(A) Fractions written in figures
1. [illegible] ..... 1
2. Words "divided by" ..... 2
3. Complex fractions ..... 3
4. Division indicated by brackets ..... 4
(B) Fractions written with words
1. Indicated division, as two-sevenths ÷ one-sixth ..... 5
2. Words "divided by" used, as one-eighth divided by three-fifths ..... 6

II. As to Procedure
(A) Nature of terms: expression of all terms as fractions
1. Unit fraction ÷ unit fraction ..... 7
2. Unit fraction ÷ other proper fraction ..... 8
3. Unit fraction ÷ improper fraction ..... 9
4. Unit fraction ÷ mixed number ..... 10
5. Unit fraction ÷ whole number ..... 11
6. Other proper fraction ÷ unit fraction ..... 12
7. Other proper fraction ÷ other proper fraction ..... 13
8. Other proper fraction ÷ improper fraction ..... 14
9. Other proper fraction ÷ mixed number ..... 15
10. Other proper fraction ÷ whole number ..... 16
11. Improper fraction ÷ unit fraction ..... 17
12. Improper fraction ÷ other proper fraction ..... 18
13. Improper fraction ÷ improper fraction ..... 19
14. Improper fraction ÷ mixed number ..... 20
15. Improper fraction ÷ whole number ..... 21
16. Mixed number ÷ unit fraction ..... 22
17. Mixed number ÷ other proper fraction ..... 23
18. Mixed number ÷ improper fraction ..... 24
19. Mixed number ÷ mixed number ..... 25
20. Mixed number ÷ whole number ..... 26
21. Whole number ÷ unit fraction ..... 27
22. Whole number ÷ other proper fraction ..... 28
23. Whole number ÷ improper fraction ..... 29
24. Whole number ÷ mixed number ..... 30
25. Whole number ÷ larger whole number, as 7 ÷ 15 ..... 31
(B) [illegible] ..... 32
(C) [illegible] ..... 33
(D) Cancellation
1. No cancellation possible ..... 34
2. Single cancellation
(a) One number a factor of the other ..... 35
(b) Two numbers with a common factor ..... 36
3. Double cancellation
(a) In each case one number a factor of the other ..... 37
(b) In each case two numbers with a common factor ..... 38
4. Rec. One case of each type of cancellation ..... 39
5. Reduction cancellation
(a) One number a factor of the other ..... 40
(b) Two numbers with a common factor ..... 41
6. Incomplete or continued cancellation
(a) One number a factor of the other ..... 42
(b) Two numbers with a common factor ..... 43
(E) Multiplication
1. Neither factor unity in numerator or denominator ..... 44
2. One factor unity in numerator ..... 45
3. One factor unity in denominator ..... 46
4. Both factors unity in numerator ..... 47
5. Both factors unity in denominator ..... 48
(F) Analysis of quotient
1. Quotient a whole number
(a) Cancellation complete, irreducible ..... 49
(b) Cancellation incomplete, reducible ..... 50
2. Quotient a proper fraction
(a) Cancellation complete, irreducible ..... 51
(b) Cancellation incomplete, reducible
1. Numerator a factor of denominator ..... 52
2. Numerator and denominator having a common factor ..... 53
3. Quotient an improper fraction, reduce to mixed number
(a) Cancellation complete, fraction irreducible ..... 54
(b) Cancellation incomplete, fraction reducible ..... 55

1 This analysis is taken from "Problems in the Teaching of Arithmetic" by
Knight, Luse, and Ruch. Iowa Supply Co., Iowa City, Iowa.


The above analysis is apparently logically complete. It contains
certain unit skills which would not be expected to appear in drill
material. Unit skills 3, 4, 5, 9, 14, 17, 18, 19, 20, 21, 31,
40, 41, and 50, while mathematically and logically possible, would not
ordinarily appear in drill material. If, then, texts provide no practice
on these unit skills, it is not to be taken as sure evidence of
carelessness.

The table below shows the frequency with which each unit skill in
the division of fractions receives practice in five Grade VI texts.


TABLE¹ SHOWING THE FREQUENCY OF PRACTICE ON UNIT SKILLS INVOLVED IN THE
DIVISION OF FRACTIONS IN FIVE GRADE VI TEXTBOOKS

Unit skill number | Text A (f) | Text B (f)
1 | 161 | 112
2 | 0 | 0
6 | 7 | 0
7 | 0 | 0
8 | 0 | 0
10 | 0 | 0
11 | 0 | 0
12 | 0 | 0
13 | 13 | 38
15 | 0 | 0
16 | 10 | 0
22 | 0 | 0
23 | 2 | 18
24 | 5 | 0
25 | 16 | 8
26 | 6 | 23
27 | 18 | 0
28 | 15 | 10
29 | 0 | 1
30 | 15 | 10
34 | 40 | 76
35 | 86 | 1
36 | 3 | 3
37 | 3 | 21
38 | 5 | 0
39 | 4 | 0
42 | 0 | 0
43 | 0 | 0
44 | 35 | 90
45 | 10 | 17
46 | 28 | 4
47 | 6 | 4
48 | 2 | 6
49 | 45 | 50
51 | 27 | 24
52 | 0 | 6
53 | 18 | 2
54 | 45 | 33
55 | 5 | 1

[The columns for Texts C, D, and E are printed apart from the unit skill numbers in this copy, and the Text C column is incomplete. Their values follow in printed order; their alignment with the unit skill numbers is uncertain.]

Text C (f): 0, 0, 29, 20, 8, 0, 32, 34, 13, 18, 14, 7, 0

Text D (f) | Text E (f)
85 | 84
0 | 4
0 | 0
0 | 0
2 | 2
0 | 0
6 | 4
8 | 6
0 | 18
11 | 4
2 | 1
2 | 7
0 | 0
25 | 14
7 | 4
1 | 3
4 | 5
0 | 1
23 | 4
24 | 35
34 | 30
3 | 5
21 | 9
0 | 1
0 | 6
0 | 1
0 | 0
62 | 46
26 | 30
29 | 30
4 | 121
38 | 10
36 | 13
19 | 35
0 | 1
0 | 1
31 | 37
0 | 1

1 Those unit skills which would not ordinarily appear in drill or which are taboo, or which are
common to all examples are omitted from this frequency table.

COMMENTS ON THE DISTRIBUTION OF PRACTICE AMONG THE SEVERAL
UNIT SKILLS INVOLVED IN THE DIVISION OF FRACTIONS


1. The texts vary in their utter omission of certain elements of the 
division of fractions. Of the unit skills listed above: 


Text A gives no practice to 12 of the unit skills. 
Text B gives no practice to 16 of the unit skills. 
Text C gives no practice to 18 of the unit skills. 
Text D gives no practice to 16 of the unit skills. 
Text E gives no practice to 5 of the unit skills. 


It is evident that these texts vary in the amount of faith held in 
transfer. To leave out 18 of the unit skills involved in the division 
of fractions assumes a transfer within a function of a type and of an 
amount that no published experimental data support. 
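The count of omitted unit skills can be reproduced for the two texts whose columns are legible beside the unit skill numbers. The frequencies below are copied from the Text A and Text B columns of the frequency table as they appear in this copy; the variable and function names are hypothetical.

```python
# Unit skill numbers listed in the frequency table.
UNIT_SKILLS = [1, 2, 6, 7, 8, 10, 11, 12, 13, 15, 16, 22, 23, 24, 25, 26,
               27, 28, 29, 30, 34, 35, 36, 37, 38, 39, 42, 43, 44, 45, 46,
               47, 48, 49, 51, 52, 53, 54, 55]

# Frequencies of practice, in the same order as UNIT_SKILLS.
TEXT_A = [161, 0, 7, 0, 0, 0, 0, 0, 13, 0, 10, 0, 2, 5, 16, 6, 18, 15, 0,
          15, 40, 86, 3, 3, 5, 4, 0, 0, 35, 10, 28, 6, 2, 45, 27, 0, 18,
          45, 5]
TEXT_B = [112, 0, 0, 0, 0, 0, 0, 0, 38, 0, 0, 0, 18, 0, 8, 23, 0, 10, 1,
          10, 76, 1, 3, 21, 0, 0, 0, 0, 90, 17, 4, 4, 6, 50, 24, 6, 2,
          33, 1]


def unpracticed_skills(frequencies):
    """List the unit skills that receive no practice at all."""
    return [skill for skill, f in zip(UNIT_SKILLS, frequencies) if f == 0]


omitted_a = unpracticed_skills(TEXT_A)
omitted_b = unpracticed_skills(TEXT_B)
```

The two lists contain 12 and 16 unit skills, agreeing with the counts stated for Texts A and B.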

After studying the analysis given above and the omissions in the 
table, the reader will be hard put to it to justify many of the omissions. 

2. These texts vary in their opinion concerning the amount of drill 
that each unit skill should have. Some of the unit skills receive much 
practice in one text and little or none in another text. Thus children 
studying Grade VI arithmetic from different texts receive a very 
different experience in the division of fractions. It would be worth 
while to know the relationship between the amount of drill in terms 
of the unit skills involved and the frequency of their errors in standard 
tests. 

3. Since the relative difficulty of the several unit skills involved in
the division of fractions is not known, and since the amount of transfer
among the unit skills is not known, the writer of drill material is on
uncertain ground when he purposely omits or gives very little practice
to any of the more common unit skills.

4. It is perfectly practical to build drill to specifications. As an
illustration of this a sample of drill material in the addition of fractions
is inserted. The unit skill numbers refer to an analysis of the addition
of fractions and mixed numbers.
SAMPLE DRILL | UNIT SKILLS NUMBER
1. [example illegible] | 1-6-12-16-19
2. [example illegible] | 2-6-12-16-19
3. Add two-thirds plus one-fourth = 11/12 | 5-6-12-16-19
4. [example illegible] | 1-6-12-15-20
5. [example illegible] | 1-6-12-15-22
6. [example illegible] | 1-6-12-15-21
7. [example illegible] | 1-7-13-15-24
8. [example illegible] | 1-8-13-18-26
9. Two and three-eighths plus three and two-fifths = 5 31/40 | 5-9-12-16-24
10. One-third, four-fifths, eight-fifteenths = 1 2/3 | 4-6-13-15-23
11. [example illegible] | 3-9-12-14-27
12. [example illegible] | 1-9-12-15-25
13. [example illegible] | 1-9-12-17-26
14. [example illegible] | 1-11-13-16-26
15. [example illegible] | 1-10-13-16-24
16. [example illegible] | 2-6-13-18-23

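The three drill items that are stated in words (items 3, 9, and 10) can be worked exactly with a short sketch. It uses Python's standard `fractions` module; the helper `as_mixed` is a hypothetical name introduced here for illustration.

```python
from fractions import Fraction


def as_mixed(value: Fraction):
    """Split a positive fraction into (whole part, proper fraction)."""
    whole, remainder = divmod(value.numerator, value.denominator)
    return whole, Fraction(remainder, value.denominator)


# Item 3: two-thirds plus one-fourth.
item_3 = Fraction(2, 3) + Fraction(1, 4)

# Item 9: two and three-eighths plus three and two-fifths.
item_9 = (2 + Fraction(3, 8)) + (3 + Fraction(2, 5))

# Item 10: one-third, four-fifths, eight-fifteenths.
item_10 = Fraction(1, 3) + Fraction(4, 5) + Fraction(8, 15)
```

The sums are 11/12, 5 31/40, and 1 2/3 respectively; the last two call for reduction of an improper fraction to a mixed number, which is itself one of the unit skills a drill of this kind is meant to exercise.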


5. It is further entirely practical that committees in charge of the 
making of curricula and the providing of instruments of instruction 
analyze and appraise such instruments not only in terms of gross 
amount but also in terms of the internal organization of the material. 


III. AWARENESS OF SUCCESS AND FAILURE AT THE TIME OF LEARNING 


Another principle of learning that should be considered in the con- 
struction of drill material is the effect of knowing how well or how 
poorly one is doing at the time of practice. There is reason to believe 
that standards of performance which are fair and honest act as a direct 
stimulant of the proper type to learning. 

However, since adequate standards as organic parts of drill material
appear in only one of the texts (Text E) used in this study, a comparison
is impossible on this point.


SUMMARY 


1. In organizing drill to maintain the skill of division of fractions, 
texts vary significantly in the distribution of the drill. Some texts 
give severe doses infrequently with long forgetting periods between 
the drills. Others fight forgetting by a little drill weekly. 








2. In organizing drill material, texts vary in the amount of material
used. With the same time distribution, one text will give twice as 
much as another. It is not known which amount is correct. But 
it is possible that proper distribution renders smaller amounts effective. 

3. Texts differ in the internal nature of drill material. Some will 
give no drill on many of the unit skills or constituent elements of the 
division of fractions and very little on other elements but large amounts 
of drill on still others. There is no evidence that this distribution is 
in terms of the inherent difficulty or the social utility of the several 
unit skills. 

Other texts omit practice on but few of the unit skills and spread 
the practice more evenly, thus refusing to bank on transfer until such 
transfer has been shown to exist. 

4. It is possible to build drill material to exact specifications. 
With the use of an analysis of the skill for which the drill is constructed, 
it is practical to provide drill which practices every unit of the total 
function with a calculated frequency. As far as is known, no harm 
would be done if the specifications of drill material in terms of its con- 
stituent elements were written first and then the drill material built to 
those specifications. 

5. It is further possible that the scientific teacher may yet buy drill 
by specifications including other factors than mere amount, as engi- 
neers have long since learned to buy steel, coal, and even the paper upon 
which drill material is printed. 

A BRIEF TEST SERIES FOR MANUAL DEXTERITY


ESTHER C. WHITMAN 


Boston Psychopathic Hospital 


This is an attempt to standardize a test series governed by manual 
dexterity as apart from higher mental processes. Various manual 
tests in common use have other primary factors such as learning ability, 
mechanical ingenuity and reasoning, and they are mostly used to 
supplement language tests of intelligence, or to take their place when 
language difficulties arise. A test so far as possible restricted 
to manual ability has special implications for vocational guidance and 
selection. 

The present test was organized in consultation with Dr. F. L. Wells 
of the Boston Psychopathic Hospital. Children of ages from 7 to 15 
were examined in the public schools of Walpole,¹ Massachusetts, taken
from the following grades: 


GRADE      I   II  III   IV    V   VI  VII VIII  H.S.  TOTAL
Cases     11   42   77   67   53   57   75   56    66    504


The tests were given entirely by the writer. Four hundred ninety- 
one records from the 504 children examined were available for stand- 
ardization, 13 records being omitted for the following reasons: 


RRR eee Meet Cad aeann been aehve Lewes eaeewe 3 
RO OD ee ET Te ET eT en, ne 3 
cee Teac ok eee ead s kek i neediness aes hh he eees on 2 
EE eer ee ee ee ey err ere Tor eee 5 


The test is composed of seven items and takes 12 to 15 minutes to 
give. Material, instructions, and scoring for the different items are 
described below. The layout of the material is illustrated in the 
accompanying cut. The tests were uniformly given in the order listed. 
Emphasis should be given the words italicized in the instructions, this 
being where misunderstandings most frequently occur. The subject 
is told that if he happens to drop a part of the test material such as a 
peg, pin, nut, or bolt, he will let it lie, pick up another from the tray 
and go right on. 





1 The writer desires to express special obligations to Mr. Frank L. Mansur, 
Superintendent of Schools, for permission to carry on the work, and to the various 
members of his staff for their cooperation. 
Items 1, 2; Pegboard A (1).—(1) Pegboards A and B are part of material used
in personnel tests by the General Electric Company, and kindly supplied by 
Mr. Johnson O’Connor, of that organization, who devised them. 

A varnished pegboard, 4 inches square, and 1 inch deep with 100 holes (10 
rows with 10 holes to a row) equally spaced one-half inch between centers, 3/32
inches in diameter, 34 inch deep. For items 1 and 2, 200 brass pins, 1 inch 
length, 6g inch in diameter. 

Instructions.—One brass pin in each hole. (Examiner fills in first row.) 

You are to use your right hand (or left—the preferred hand is essential) and 
see how many of these holes you can fill with these pins before you are told to 


Plan showing the layout of the Adjusto tray, the pegboards, and other material for
the test. S denotes the position of the subject, facing the material. R denotes
positions for right-handed subjects, L for left-handed subjects. Labels in the
plan: pins; pegs; Pegboards A and B (Items 1, 2, 3); Item 4, R and L; bolts, nuts,
nuts and bolts (Items 5, 6, R and L); Item 7, R and L.


stop. But remember that you are to use only your right (left) hand and pick up 
only one pin at a time. Now have a pin in your hand and be ready to start. 
Ready—Go. After one minute—Stop. 

Score, one-third the number of holes filled (to nearest integer), e.g., 29 holes 
score 10, 28 holes score 9. 

Item 2.—Other hand. Give as above. Time, one minute. 

Score, one-third of holes filled. 

Item 3; Pegboard B (1).—Same as Pegboard A except that the holes are 3/16 inch
in diameter and 34 inch deep. Pins used in item 1. 

Instructions.—Three pins in each hole. (Examiner fills in first five holes.)

This time you are to use both your hands and put three of these pins in each 
hole—you can do it any way you wish as long as you use both your hands. Just
have three in your hand for the first hole and then do it any way you wish—Ready—
Go. After two minutes—Stop. 

Score, the number of holes correctly filled. 

Item 4; Large Pegboard (standard educational equipment).—Ten inches square, 
14 inch deep—100 holes, 34, inch in diameter, 44g inch deep, spaced one inch 
between centers. Adjusto Tray. 

For items 4 and 7, 100 pegs, 2 inches long, about 3/16 inch diameter, 20 of each
of the colors red, orange, yellow, green, purple (standard educational equipment). 

Instructions.—(Examiner fills in first two horizontal rows.) 

With this board you are to use only your right hand again and pick up one peg 
at a time. First you will put in a red peg, skip a hole, then an orange one, skip
a hole, then a yellow, then a green and last a purple, remembering to skip a hole 
between each color. But remember to do each row in the same order, red, orange, 
yellow, green, purple, and see how many rows you can do before I tell you to 
stop. Now have a red peg in your hand and be all ready to start—Ready—Go. 
After one minute—Stop. 

Score, number of pegs placed according to directions. 

Item 5; Nuts and Bolts, Adjusto Tray.—In items 4, 5, and 7, this tray should 
with right-handed subjects be placed with the large compartment (see Figure) 
at the right of the subject, and at the left for the left-handed. 

Twenty bolts, one inch long, 5/32 inch in diameter. (Stove bolts, 16 threads
to the inch); nuts, 34 inch square, to fit same. 

Instructions.—(Examiner disassembles two nuts and bolts.) 

You are to take off as many of these nuts as you can—put the nuts in one place 
(compartment) and the bolts in another. Have one in your hand so that you will 
be ready to start. Ready—Go. After 30 seconds—Stop. 

Score, twice the number disassembled. 

Item 6.

Instructions.—(Examiner assembles two nuts and bolts.) 

Now you are to put together as many as you can—just put the bolt on far 
enough so that the top of the bolt shows over the nut and then put it in the tray. 
Have one in each hand (bolt and nut)—Ready—Go. After 30 seconds—Stop.

Score, twice the number assembled. 

Item 7; Tray, Adjusto Tray (standard office equipment). 

Long shallow tray, olive green, about 21 inches long when adjusted to four 
small and one large compartment, about four inches wide. Colored pegs used in 
item 4. 

Instructions.—(Examiner sorts out colors twice.) 

This time you are to sort as many pegs as you can in these compartments, in 
the first compartment red,¹ next yellow, in the next green, and last purple. But
always do it in the same order, first red, then yellow, green, purple and use only 
your right hand—Ready—Go. After 30 seconds—Stop. 

Score, number of pegs sorted according to directions. 
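
The scoring rules stated for the seven items can be collected in a short sketch. The function names and the sample counts below are hypothetical and serve only as illustration; the rules themselves (one-third of the holes filled for items 1 and 2, twice the number handled for items 5 and 6, a plain count for items 3, 4 and 7) are those given in the descriptions above, and the total is taken as the sum of earned points.

```python
def third_of_holes(holes_filled: int) -> int:
    """Items 1 and 2: one-third of the holes filled, to the nearest integer."""
    # A count divided by 3 never ends in exactly .5, so round() is unambiguous.
    return round(holes_filled / 3)


def doubled(count: int) -> int:
    """Items 5 and 6: twice the number of nuts and bolts handled."""
    return 2 * count


def total_points(item_scores: list[int]) -> int:
    """Sum of earned points over the seven items."""
    if len(item_scores) != 7:
        raise ValueError("expected one score for each of the seven items")
    return sum(item_scores)


# Hypothetical record for one child; the raw counts are invented.
record = [
    third_of_holes(29),  # item 1, preferred hand
    third_of_holes(25),  # item 2, other hand
    17,                  # item 3, holes correctly filled with three pins
    12,                  # item 4, pegs placed according to directions
    doubled(3),          # item 5, nuts taken off in 30 seconds
    doubled(6),          # item 6, nuts put on in 30 seconds
    15,                  # item 7, pegs sorted according to directions
]
```

The two worked cases in the text (29 holes score 10, 28 holes score 9) both follow from `third_of_holes`.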


1 The orange pegs are present but not used. No allusion is made to them unless 
confusion results, which is rare. 

The tests were given to children ranging from 7 to 15 years of age
as follows: 


YEAR       7    8    9   10   11   12   13   14   15
Boys      19   29   24   29   26   28   34   28   28
Girls     30   29   31   22   26   29   42   24   13
Total     49   58   55   51   52   57   76   52   41   (in all, 491)


The writer attempted to examine 50 cases in each year but as the 
work was done by grades to simplify giving the group tests, the groups 
are not precisely equal. In arriving at the percentiles, a five-step 
interval was used, e.g., cases scoring a total from 45 to 49.9 fell under 
the same group. Percentiles are given for each year, that is all cases 
with ages ranging from 7 years, 0 months to 7 years, 11 months fell in 
the distribution curve for 7 years. The median scores are also given
for each item at each year. 
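
As a rough sketch of the grouping just described, the two helpers below place a total score in its five-step interval and an age in its year group. The function names are hypothetical, and it is an assumption that each interval runs from its lower bound up to, but not including, the next multiple of five.

```python
def score_interval(total: float, step: int = 5) -> tuple[int, int]:
    """Return (lower, upper) bounds of the step interval holding the score.

    The interval is closed below and open above, so 45 and 49.9 both
    fall in (45, 50), while 50 opens the next interval.
    """
    lower = int(total // step) * step
    return lower, lower + step


def year_group(age_in_months: int) -> int:
    """Year group for an age in months: 7 years 0 months through
    7 years 11 months (84 to 95 months) all count as year 7."""
    return age_in_months // 12
```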


YEAR              7    8    9   10   11   12   13   14   15
75 percentile    86   82   91  102  106  112  114  116  124
Median           69   74   82   94   99  102  109  109  115
25 percentile    64   65   73   84   92   91   99   98  106


In year 14 the great majority of cases are on the low-score side of 
the mode, high scores being relatively absent, but this is not the case 
with year 15. The 14-year group seems to have been selected in some
unexplained way; it shows no improvement over year 13, while the
score of year 15 is again increased. No marked sex difference appears
in these quantities though at seven of the nine year levels the scores
for the boys are higher. 


MEDIAN SCORES FOR THE SEPARATE ITEMS

Item, year |   1  |  2  |   3  |   4  |   5  |   6  |   7
     7     |  7.7 | 7.4 | 16.2 | 11.5 |  5.7 | 11.2 | 14.5
     8     |  8.0 | 7.6 | 17.2 | 13.5 |  6.0 | 11.5 | 15.6
     9     |  8.7 | 7.8 | 18.2 | 13.9 |  7.4 | 18.4 | 17.5
    10     |  9.3 | 8.4 | 22.1 | 19.3 |  7.7 | 13.5 | 18.1
    11     |  9.4 | 9.0 | 22.8 | 21.4 |  8.0 | 14.7 | 19.9
    12     | 10.0 | 9.1 | 23.9 | 19.8 |  9.2 | 15.7 | 20.9
    13     | 10.3 | 9.3 | 25.1 | 21.6 |  9.7 | 15.4 | 21.6
    14     | 10.1 | 9.3 | 25.2 | 20.0 |  9.6 | 16.3 | 21.9
    15     | 10.4 | 9.5 | 26.2 | 23.3 | 11.5 | 16.7 | 24.1

The most striking feature of this table is the difference in the way
the tasks are affected by growth. Items 1 and 2 are relatively little 
affected; 4 and 5 quite markedly. 

Four hundred and thirty-four children were given the “Myers
Mental Measure” and the scores were correlated with the scores in
the tests of dexterity. The correlation for the entire group was +.65,
PE .019, which figure is governed by the improvement in both classes
of performance with age. The correlations by years were as follows: 








Year    Cases    Correlation    PE
  7       41        +.31        .10
  8       50        +.37        .08
  9       46        +.35        .09
 10       45        +.34        .09
 11       48        +.40        .08
 12       51        +.50        .07
 13       67        +.37        .07
 14       50        +.40        .08
 15       35        +.26        .11














These figures were calculated by the Pearson formula applied to 
the scatter diagram (guessed average method).¹ Scattering seems to
limit their reliability more than does the relatively small number of 
cases in each group. 
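
The PE values reported with these correlations are not derived in the text. As a sketch, and on the assumption that the usual large-sample formula for the probable error of a Pearson coefficient was used, PE = 0.6745 (1 - r²) / √N, the tabled figures can be reproduced from r and the number of cases. The function name is hypothetical.

```python
import math


def probable_error_of_r(r: float, n: int) -> float:
    """Probable error of a Pearson r under the usual large-sample formula.

    Assumption: PE = 0.6745 * (1 - r**2) / sqrt(n). The source reports
    PE values but does not state which formula produced them.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Entire group: r = +.65 with 434 cases.
whole_group = probable_error_of_r(0.65, 434)
```

Under this assumption the whole-group value rounds to .019, and the yearly rows checked (for example year 7 with 41 cases and year 15 with 35 cases) round to the tabled .10 and .11.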

Intercorrelation between certain items was calculated by the per- 
centage of cases on opposite sides of the median, but is expressed in 
terms of r, (Rugg, Statistical Methods Applied to Education, 1917, 
p. 403).
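
The text does not state how the percentage of cases on opposite sides of the median was converted to r. One common conversion of the period is the method of unlike signs, r = cos(πU), where U is the proportion of cases lying above the median on one test and below it on the other. The sketch below assumes that conversion; the function name and the example counts are hypothetical.

```python
import math


def r_from_unlike_signs(unlike: int, n: int) -> float:
    """Estimate r from the number of cases on opposite sides of the two medians.

    Assumption: r = cos(pi * U), with U = unlike / n. The source cites
    Rugg for the method but does not print the formula.
    """
    if n <= 0 or not 0 <= unlike <= n:
        raise ValueError("need 0 <= unlike <= n and n > 0")
    return math.cos(math.pi * unlike / n)


# Hypothetical example: 9 of 49 cases fall on opposite sides of the medians.
example_r = r_from_unlike_signs(9, 49)
```

With no unlike cases the estimate is 1, and with half the cases unlike it is 0, which is the behavior expected of such an index.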


PEGBOARD A AND ASSEMBLING BOLTS (ITEMS 1 AND 6)

Year....    7     8     9    10    11    12    13    14    15
r.......   .03   .21   .53   .53   .66  .038   .66   .58   .27

PEGBOARD A AND PEGBOARD B (ITEMS 1 AND 3)

Year....    7     8     9    10    11    12    13    14    15
r.......   .84   .61   .48   .61   .24   .45   .56   .48   .18

PEGBOARD A AND LARGE PEGBOARD (ITEMS 1 AND 4)

Year....    7     8     9    10    11    12    13    14    15
r.......   .36   .48   .61   .56   .36  .3860  .45   .24   .56





1 For essential help in these computations, the writer is indebted to Miss 
Margaret S. Child of the Boston Psychopathic Hospital. 

None of these relationships is so clear as to suggest the advisability
of eliminating any of the tests concerned on the ground of duplication. 
In view also of the marked differential effect of age on the various 
items, a general score for the whole test (such as the sum of earned 
points) is unrepresentative in the same way that an intelligence quo- 
tient does not represent high or low performances in digit span, or visual 
imagery. A profile (Rossolimo) method of presenting such results, 
such as is utilized in Wells and Martin’s Memory Test Series,¹ may
well gain in significance what it loses in cumbrousness. 


1 American Journal of Psychiatry, Vol. III, 1923, pp. 243-257. 

THE DISTRIBUTION OF INTELLIGENCE AMONG
COLLEGE STUDENTS 


STUART A. RICE 
Dartmouth College 


That each individual possesses innate intelligence capable of devel- 
opment to a certain fixed level but no farther is a theory which has 
received its recent impetus from the Army Alpha tests. A supple- 
mentary theory supported by these tests is that intelligence is dis- 
tributed in the population normally, or according to the curve of 
error. Thus, assuming that the one million seven hundred thousand 
drafted men examined were a fair sample of the population of the 
United States, we are led to infer that 25 per cent of adult Americans 
are of “average” or “C” intelligence, and that the proportions of 
the population falling within classes of equal width on either side of 
this average successively diminish. On the lower end of the scale, 
the persons of C-, D and D- intelligence are respectively 20 per cent,
15 per cent and 10 per cent of the total. On the upper end of the scale
the proportions having C+ (high average), B (superior) and A (very
superior) intelligence are respectively 16½ per cent, 9 per cent and
4½ per cent of the entire number.¹ It will be seen that the distribution
curve is approximately normal in form.

The mental levels designated have been translated into terms 
representing capacity for performance in our educational system: 


Men of A intelligence have the ability to make a superior record in college or 
university, while D- men . . . are rarely able to go beyond the third or fourth
grade of the elementary school . . . B intelligence is capable of making an average
record in college. C+ intelligence cannot do so well, while mentality of the
C grade is rarely capable of finishing a high school course.²

Much of the discussion based on the army tests assumes that the 
educational levels reached by individuals are indications of their 
mental levels. Men who get into college belong to the upper intelli- 
gence classes of the general population, specifically to levels C+, B 
and A. If the relationship between educational level and mental level 
were high, the distribution of college students according to intelligence 
would be indicated not by the normal “bell-shaped” curve, but by the 
upper part alone of that curve, roughly that portion describing the 





1 From Report of the Surgeon General of the United States, reproduced by 
Henry H. Goddard, Human Efficiency and Levels of Intelligence, pp. 24-26.
2 Ibid., pp. 26-27.
upper quartile. Assuming this relationship, and assuming intelligence
to be correlated with excellence of performance in college subjects, 
we should expect the typical undergraduate body to contain a large 
proportion of students doing inferior work, a smaller proportion doing 
mediocre work, and a still smaller proportion doing work of high stand- 
ard. If this distribution is not found, it will be an indication that 
college students are not drawn in equal proportions from the intelli- 
gence levels presumed to be of college grade—that the level of educa- 
tional attainment reached, in other words, is not in general a fair 
indication of innate mental capacity. 

The grading systems actually employed in our colleges assume that 
within the student body a normal distribution of intelligence, or at 
least of intellectual performance, is to be found. In the grading system 
used at Dartmouth College, which is probably representative, it is 
expected that one-quarter of the enrollment in a large course will 
attain the “superior” grades of A and B, that one-half will merit a
C, or “average” grade, and that one-quarter will merit the D and E
which are indicative of “inferior” work.

An empirical test of this assumption usually serves to substantiate 
it. A recent examination in a large course in elementary sociology 
called for a judgment by the students of the truest proposition among 
four submitted concerning each of 20 topics covered in the course. 
While familiarity with the subject-matter of the course was essential, 
the power of logical discrimination, presumed to be closely akin to 
intelligence, appears to have been the major requisite for a high score. 
The distribution of 401 grades on a scale ranging from 0 to 50, was 
as follows:¹














Grade    | Number of   | Grade    | Number of   | Grade    | Number of
received | students    | received | students    | received | students
   18    |      3      |    30    |     49      |    43    |     28
   20    | [illegible] |    33    | [illegible] |    45    |     10
   23    |     21      |    35    |     69      |    48    |      3
   25    |     32      |    38    |     47      |          |
   28    |     48      |    40    |     31      |          |







Total number of students 401. 





1 Each correct answer was given a credit of two and one-half points. No dis- 
credits were given, the subject merely failing to score in the case of a wrong answer. 
Where the aggregate grade totalled to a half-point, the grade assigned was the 
next highest whole number. It will be noted that no subjective factors were 
involved in this method of grading.

It should be noted that a grade of 13 should be expected on the
basis of chance alone. Hence, the expected range of performance
would extend over 16 classes in this discrete series, from grade 13 to
grade 50, with a mid-point of the series between grade 30 and grade 
33. The actual range is from 18 to 48, with a well-defined mode at 
35. When smoothed with a three-class moving average, the distribu- 
tion curve conforms closely to the normal type, with a mode at grade 
33. This series no doubt corresponds closely to others which have 
been obtained in similar tests elsewhere. 
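
The arithmetic behind the chance grade and the 16 classes can be sketched as follows, using only the rules stated in the footnote on grading: 20 topics, four propositions each, two and one-half points per correct answer, and half-points raised to the next whole number. The function and variable names are hypothetical.

```python
import math

TOPICS = 20             # topics covered in the examination
CHOICES = 4             # propositions submitted for each topic
POINTS_PER_ITEM = 2.5   # credit for each correct answer


def grade(correct: int) -> int:
    """Grade for a given number of correct answers; half-points round up."""
    if not 0 <= correct <= TOPICS:
        raise ValueError("correct must lie between 0 and 20")
    return math.ceil(POINTS_PER_ITEM * correct)


# Pure guessing is expected to yield one correct answer in four.
expected_correct_by_chance = TOPICS // CHOICES
chance_grade = grade(expected_correct_by_chance)

# Grade classes from the chance level up to a perfect paper.
classes = [grade(k) for k in range(expected_correct_by_chance, TOPICS + 1)]
```

The list `classes` runs 13, 15, 18, ..., 48, 50, and its two middle members are 30 and 33, which is the mid-point named in the text.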

The result indicates that the distribution of intelligence within
the student body does not correspond to the distribution within
the comparable portions of the non-college population. The disparity
between the two is indicated in Fig. 1. Curve P represents the
apparent distribution of intelligence within that part of the general
population from which college students are alleged to be drawn.
Curve S represents the apparent distribution within the undergraduate
body itself.

Fig. 1.—Apparent distribution of intelligence within the general population
and within the undergraduate student body. Curves: P (General Population),
S (Student Body). Horizontal axis: scale of intelligence within general
population.

We may conclude that educational attainment, while obviously 
not unrelated altogether to innate mental capacity, shows at least no 
correlation with the latter within those mental levels of the general 
population from which the American college draws its students. A 
variety of factors in addition to intelligence bring young men and 
women to college. It would be interesting to ascertain whether the 
normal frequency curve tends to disappear, and the distribution 
become wedge-shaped as in the figure by the time the gradu- 
ating class receives diplomas. Are those at the lower end of the 
entering distribution cut off during the four year effort to meet college 
requirements, or are they passed along to graduation on the basis of 
priority of residence? By the answer to this question may be gauged the 
importance of a college degree as a certificate of innate mental capacity. 

EFFECTIVE DRILL EXERCISES IN ARITHMETIC
R. S. NEWCOMB 
East Central State Teachers College, Ada, Oklahoma 


Since efficiency in all subsequent arithmetic work is based largely 
upon proficiency in the fundamental processes, it is desirable that pupils 
attain a reasonable standard of speed and accuracy in these processes. 
As a result a number of methods of drill have been formulated and are 
used by teachers in an attempt to bring about higher standards of 
achievement in this particular. Many of these methods of drill, 
however, have been formulated without due regard to the results of 
psychological experimentation in connection with the subject matter 
of arithmetic and the laws of habit formation, with the result that the 
final achievement of pupils has not always been satisfactory. 

A survey of the drill material provided in textbooks in arithmetic 
reveals the fact that it falls far short of being sufficient in amount and 
is neither systematic nor proportionate. According to Dr. Thorndike,¹
an examination of textbooks in elementary arithmetic shows that 
drill is provided in abundance for some of the elementary bonds, but 
very little and in some instances none at all for the bonds of higher 
decades. He holds that “the task of computation is not over when
the child learns the addition combinations to 9 plus 9, the subtraction 
combinations to 18 minus 9, the multiplication combinations to 9 
times 9 and the division combinations to 81 divided by 9. This 
conception has caused much trouble. The table bonds do not make 
up one-fourth of the necessary bonds.” The pupil needs drill upon
combinations with higher decades such as 7 plus 56, 81 minus 8, etc., 
if proficiency in computation is to be acquired. 

For the purpose of determining a more effective drill exercise and 
of verifying certain of these psychological theories, a series of drill 
exercises was prepared by the writer and with the assistance of a num- 
ber of teachers was given to the pupils in several elementary grade 
classes. These exercises provide practice upon the combinations of
each number from 1 to 10 with each number from 0 to 100. The
exercises are arranged upon cards 3 by 6% inches in size. There are 
20 cards for each of the four fundamental operations or a total of 80 
exercise cards in the series. Each card contains about 50 examples 

1 Thorndike, Edward L.: The Psychology of Arithmetic. The Macmillan Co.,
N. Y., 1922, pp. 126-127 and p. 106.

and the entire series comprises about 4000 drill examples. A series of
50 numbers is arranged near the edges of each card and in the center of 
the card appears the number which is to be added to, subtracted from, 
etc., each of the numbers around the edge. In actual practice pupils 
seldom refer to this number more than once or twice but retain it in 
mind as they proceed with the various combinations around the card. 
When the pupil is ready to begin work the card is placed upon a sheet 
of tablet paper and the answers written upon this paper near the edge 
of the card. Results may be checked rapidly by the pupil by turning 
the card over and comparing his results with the answers which appear 
on the reverse side. 
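
The arrangement of the cards can be illustrated with a small sketch for the addition series. The text gives the totals (20 cards for each operation, about 50 examples on a card, each number from 1 to 10 combined with each number from 0 to 100) but not how the numbers were divided among cards. The split used here, two cards for each center number covering 0 to 50 and 51 to 100 in plain order, is an assumption made only for illustration, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DrillCard:
    center: int        # the number printed in the center of the card
    edge: list[int]    # the numbers arranged near the edges

    def answers(self) -> list[int]:
        """Answers on the reverse side: center added to each edge number."""
        return [n + self.center for n in self.edge]


def build_addition_cards() -> list[DrillCard]:
    """Twenty addition cards covering 1..10 combined with 0..100."""
    cards = []
    for center in range(1, 11):
        # Assumed split: lower and upper half of the 0..100 range.
        cards.append(DrillCard(center, list(range(0, 51))))
        cards.append(DrillCard(center, list(range(51, 101))))
    return cards


def check(card: DrillCard, written: list[int]) -> int:
    """Number of the pupil's written answers that agree with the card."""
    return sum(1 for got, want in zip(written, card.answers()) if got == want)


cards = build_addition_cards()
examples_per_operation = sum(len(card.edge) for card in cards)
```

Under this split each operation has 1010 examples, so four operations give 4040, which agrees with the "about 4000" of the text. The combination 7 plus 56 mentioned earlier falls on the second card for center number 7.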

This series of drill exercises was used by a number of elementary 
school classes during the latter part of the school year 1922-1923 
and was found very effective. In order to determine exactly the 
improvement resulting from the use of the series, records were kept of 
the work in addition, subtraction and multiplication made by the 
pupils of two Grade VII classes. Actual drill by the pupils of these 
classes in each of the processes consisted of five or six minutes every 
day over a period of 35 school days. At the end of this time the 
majority of the pupils had completed the 20 drill cards comprising the 
exercises of one process. In each case before the drill exercises were 
begun and after they were completed the Courtis Standard Research 
Tests in Arithmetic, Series (B), were given. To serve as a further 
check the Courtis Tests were also given on the same dates to another 
Grade VII class. The teacher and pupils of this class knew nothing of 
the drill exercises and the class was conducted in the usual manner. 
Practically the same kind and amount of work was done by the pupils 
of all three classes during the intervening period. The scholastic 
preparation of the teachers was not the same but all were successful 
teachers of several years’ experience. A comparison of the intelligence 
quotients of the pupils of the several classes did not reveal on the whole 
any appreciable differences. 

The drill classes with one exception show an appreciable percentage 
of improvement in both speed and accuracy in all of the processes. 
The highest percentage of improvement in speed of 26.9 per cent was 
made by Class C in addition, and the highest percentage of improve- 
ment in accuracy of 51.3 per cent was made by Class A in addition. 
The only improvement made by the non-drill class was 7.8 per cent 
in speed in addition. No improvement or loss was made by this class
in speed or accuracy in multiplication, but a decided loss in both speed
and accuracy in subtraction and in accuracy in addition is noted.

TABLE I

Showing percentage of gain or loss in median speed and accuracy scores.
A and C were the drill classes, B the non-drill class.

                            Per cent gain or loss   Per cent gain or loss
Class    Number of pupils      in median speed        in median accuracy

Addition
  A            21                   22.8                    51.3
  B            21                    7.8                    -3.7
  C            30                   26.9                    20.8
Subtraction
  A            21                    8.2                     0.0
  B            21                   -3.5                   -19.9
  C            33                   12.6                     6.9
Multiplication
  A            22                    4.4                    34.1
  B            21                    0.0                     0.0
  C            34                    9.3                     5.3

TABLE II

Percentage of pupils in the drill and non-drill classes increasing in speed and
accuracy.

                          Per cent      Per cent      Per cent increasing
Class       Number of    increasing    increasing       in both speed and
             pupils       in speed     in accuracy          accuracy

Addition
  A and C      51           70.7          68.6                49.0
  B            21           38.1          57.1                19.1
Subtraction
  A and C      54           50.0          55.6                29.6
  B            21           19.1          28.6                 0.0
Multiplication
  A and C      56           51.8          55.4                35.7
  B            21           42.9          42.9                19.1

A study of the several individual scores made by the pupils in each
of the classes reveals some very interesting facts. In accuracy in 
addition, for instance, the drill classes made a total average gain per 
pupil of 12.9 points while the non-drill class made a total average gain 
per pupil of 2.5 points. In other words the drill classes gained 5.2 
times as many points per pupil in accuracy as the non-drill class. In 
accuracy in subtraction the drill classes made a total average per pupil 
gain of 4.3 points while the non-drill class actually lost 9.9 points per 
pupil. In accuracy in multiplication the drill classes made an aver- 
age gain of 9.1 points per pupil while the non-drill class lost 2.3 points 
per pupil. The drill classes made an average per pupil gain of 1.59 
points in speed in addition while the non-drill class made a total average
per pupil gain of .09 of a point. A comparison of the improvements
made by the classes in this case shows that the drill classes gained 17.7 
times the amount gained by the non-drill class. In subtraction and 
multiplication the respective gains in speed per pupil in the drill classes 
were .39 and 1.33 points, with losses per pupil in each case of .77 and 
.24 of a point by the non-drill class. 
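
The two ratios quoted in this paragraph follow directly from the per-pupil gains. A minimal check is sketched below; the function name is hypothetical, and the ratio is left undefined where the non-drill class lost ground, as it did in subtraction and multiplication.

```python
def times_as_large(drill_gain: float, non_drill_gain: float) -> float:
    """Ratio of the drill classes' gain per pupil to the non-drill gain.

    Only meaningful when the non-drill class actually gained; a loss or
    a zero gain makes "times as many" undefined.
    """
    if non_drill_gain <= 0:
        raise ValueError("ratio undefined when the non-drill class did not gain")
    return drill_gain / non_drill_gain


accuracy_ratio = times_as_large(12.9, 2.5)   # accuracy in addition
speed_ratio = times_as_large(1.59, 0.09)     # speed in addition
```

Rounded to one decimal these give 5.2 and 17.7, the figures stated in the text.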

A comparison of the percentages of improvement made by the 
pupils in the drill classes in this experiment with those of pupils of the 
same grade in similar experiments in the past affords a very desirable 
means of estimating the real worth of the proposed drill exercises. 
J. C. Brown¹ in 1912 in a similar experiment, with the exception that
the drill exercises were altogether different, found that the percentage
of improvement in speed in addition in the case of 51 pupils in Grades
VI, VII and VIII was 18.5 per cent. Mary A. Kerr² in 1916 found
after an almost identical experiment, except that the drill exercises used 
were of an entirely different nature, that in addition VIIB pupils im- 
proved 11.3 per cent in speed and 25 per cent in accuracy, and that VIIA 
pupils improved 20.4 per cent in speed and 14.5 per cent in accuracy. 
The average percentages of improvement in speed and accuracy in 
addition made by pupils in the drill classes reported in this article, 
25.2 per cent and 33.4 per cent respectively, are considerably in excess 
of the improvements reported by the two authors above. Further 
percentages of improvement reported by Brown in his experiment could 





1 Brown, J. C.: An Investigation on the Value of Drill Work in the 
Fundamental Operations in Arithmetic. Journal of Educational Psychology, 
No. 2, 81-88; No. 3, 561-570, 1913. 

2 Kerr, Mary A.: The Effects of Six Weeks’ Daily Drill in Addition. Indiana
University Studies, 1916, Reported by Daniel Starch. 
not be used in comparison because in the experiment from which
data have just been quoted drill in subtraction and multiplication 
occurred at the same time as drill in addition, while in the experiment 
described in this paper drill in subtraction followed immediately 
after drill in addition, and drill in multiplication followed that upon 
subtraction. 
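
The average percentages of improvement for the drill classes in addition (25.2 and 33.4) are not derived in the text. As a sketch, and on the assumption that the two class figures of Table I were averaged with each class weighted by its number of pupils, the reported values can be reproduced. The function name is hypothetical.

```python
def pupil_weighted_mean(values: list[float], pupils: list[int]) -> float:
    """Mean of class percentages, each weighted by the number of pupils."""
    if len(values) != len(pupils) or not values:
        raise ValueError("need one pupil count for each value")
    return sum(v * n for v, n in zip(values, pupils)) / sum(pupils)


# Addition, drill classes A (21 pupils) and C (30 pupils), from Table I.
speed = pupil_weighted_mean([22.8, 26.9], [21, 30])
accuracy = pupil_weighted_mean([51.3, 20.8], [21, 30])
```

A simple unweighted mean of the two classes would give about 24.9 and 36.1 instead, so the weighted form appears to be the one that matches the article's figures.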

That the right kind and amount of drill upon the fundamentals 
results in an improvement in speed and accuracy in the fundamentals 
and also in the solution of reasoning problems as well, has been the 
common knowledge of educators for a number of years. While the 
results of this study tend further to confirm this idea, yet it is the opin- 
ion of the writer that the significant points of value revealed are, 
(1) that systematic and proportionate drill upon combinations with 
higher decades, so universally neglected, as well as upon the funda- 
mental combinations affords a type of drill which will yield excellent 
results; (2) that drill upon combinations of “seen numbers” with
“thought numbers” is a valuable method of drill, especially in preparation
for column addition; (3) that special drill upon difficult combinations
or an over-drill upon certain ones, is to be desired and (4) that
the particular method of drill used, while not all-sufficient, is economical
of time and effort for both teacher and pupils in providing the
type of drill indicated above and in bringing about a realization of 
desired results. 

NOTES ON ARTICLES IN EDUCATIONAL
PSYCHOLOGY IN CURRENT ISSUES OF
OTHER MAGAZINES


REPORTED BY C. O. MATHEWS











ACHIEVEMENT TESTING 


A Comparative Study of Certain Silent Reading Tests. Sister M. Kathleen. 
The Catholic Educational Review, 1924, December, 589-595. A study of the 
objectivity, validity and general usefulness of seven widely used tests. 

The New Type of Examination. T. C. Tressler. The English Journal, 1924, 
December, 709-715. Illustrations of the new type of examination in the English 
field with a summary of advantages. 

Tests and Measurements in the Public Schools. F. L. Cordozo. School and 
Society, 1924, December 20, 797-798. A list of state and city bureaus of standards 
and measurements or bureaus of research with yearly appropriations. 

Chronological Age as a Factor in Reading Achievement. Fowler D. Brooks. 
School and Society, 1924, December 27, 826-828. Chronological age has very 
little influence on the reading ability of pupils within a grade. 

A Study Outline Test. F. Dean McClusky and Edward William Dolch. The 
School Review, 1924, December, 757-772. A preliminary report of the con- 
struction of a test to measure and diagnose the ability of students to outline 
printed materials. 

LEARNING 


Note on Building Likes and Dislikes in Children. Fred A. Moss. Journal of 
Experimental Psychology, 1924, December, 475-478. Shows how conditioned 
responses are built up by experimenting with two children, age two and age four. 

Learning in the Case of Three Dissimilar Mental Functions. Fowler D. Brooks. 
Journal of Experimental Psychology, 1924, December, 462-468. Results of a 
learning experiment in cancellation, mental multiplication and inverted writing. 

The Curve of Learning in Typesetting. Chalice M. Kelley and H. A. Carr. 
Journal of Experimental Psychology, 1924, December, 447-455. A curve of 
learning based upon the entire work record of the subjects under actual trade 
school conditions. 

Practice in the Fundamentals of Arithmetic. Worth J. Osburn. Journal of 
Educational Research, 1924, December, 356-363. An analytic study of a widely 
used text in primary arithmetic to determine amount and character of practice 
afforded. Also a report of practice provided in long division by six widely used 
texts. 

The Dependence of Learning and Recall upon Prior Mental and Physical Con- 
ditions. Paul L. Whitley. Journal of Experimental Psychology, 1924, December, 
420-428. An attempt to test the assumption that short fluctuations in learning 
are due to prior physical and mental conditions. Results by groups fail to show 
this though some individuals were affected. 

Correlation of Auditory Digit Memory Span with General Intelligence. Arthur 
S. Clark. The Psychological Clinic, 1924, November, 259-260. This study 
shows that memory span and general intelligence as measured by the tests used are 
not correlated. 

Some Memory Span Problems—An Analytic Study at the College Adult Level. 
R. A. Brotemarkle. The Psychological Clinic, 1924, November, 229-258. The 
diagnostic value of the memory span and its relation to the complexity of mental 
organization. 

Effects of Interval on Recall. Warner Brown. Journal of Experimental 
Psychology, 1924, December, 469-474. An experiment in the recall of 48 words 
immediately, after 8 and 16 minutes and from three to seven days. 

The Influence of Competition on Performance: An Experimental Study. Irving 
V. Whittemore. The Journal of Abnormal Psychology and Social Psychology, 
1924, October-December, 236-253. Experiments in group and individual perfor- 
mance under competition. In each case under competition the quantity of pro- 
duction increases while the quality decreases. 


CHARACTER AND PERSONALITY 


Organized Personnel Research and Its Bearing on High-school Problems. W. 
Hardin Hughes. Journal of Educational Research, 1924, December, 386-398. 
The applications of ratings of pupils’ capacities, attitudes and interests and their 
relation to achievement and mental ability. 

Five Factors in the Teaching of Ideals. W. W. Charters. The Elementary 
School Journal, 1924, December, 264-276. The five factors are: “Create the 
desire, diagnose the situation, develop a plan of action, require practice, and 
generalize the ideal.” 

On the Loss of Reliability in Ratings Due to Coarseness of the Scale. Percival M. 
Symonds. Journal of Experimental Psychology, 1924, December, 456-461. A 
statistical proof that the best number of class intervals in rating scales for person- 
ality traits is seven. 


MISCELLANEOUS 


Certain Neglected Social Institutions. Charles H. Judd. The Elementary 
School Journal, 1924, December, 254-263. The third of a series of articles on 
educational psychology. Discusses the social importance of thrift and weights 
and measures. 

Scientific Study of Visual Education. Frank N. Freeman. Journal of Edu- 
cational Research, 1924, December, 375-385. A summary of the investigation of 
various forms and methods of visual education with special emphasis on moving 
pictures. 

Methods of Study Used by College Women. Jessie Allen Charters. Journal of 
Educational Research, 1924, December, 344-355. An attempt by personal inter- 
views to answer the question, “How do students naturally study?” 

A Statistical Study of Usage and of Children’s Errors in Capitalization. S. L. 
Pressey. The English Journal, 1924, December, 727-732. A study of the 
frequency of use and of mistakes of capitalization with suggestions regarding its 
teaching. 

The Possibilities and Limitations of Training. Lewis M. Terman. Journal of 
Educational Research, 1924, December, 335-343. The announcement of a pro- 
posed study to determine the relative contributions of heredity and training to our 
educational product. 

A Study of Estimates of Intelligence from Photographs. Donald A. Laird and 
Herman Remmers. Journal of Experimental Psychology, 1924, December, 429- 
446. An experimental study with a historical sketch of ability to judge intelligence 
from photographs. The individuals attempting to arrange the pictures of 10 
persons according to intelligence could have done as well with their eyes closed as 
with them open. 
CONDUCTED BY JOHN HOCKETT 
The Lincoln School of Teachers College 


A STUDY OF YOUNG CHILDREN 


Psychology of the Pre-school Child, by Bird T. Baldwin and Lorle 
I. Stecher. New York: D. Appleton & Co., 1924. Pp. 305. 


Psychologists, teachers and others who are professionally inter- 
ested in the welfare of young children are confronted with the need for 
a child psychology based upon several types of controlled experiment. 
The authors of this book have followed two of the techniques neces- 
sary to the final development of the new psychology: Systematic obser- 
vation of children and the use of standard measurements. A three 
years’ study of 105 slightly superior pre-school children has resulted 
in the establishment of tentative norms of growth and behavior 
characteristic of the age levels from two to six years. Standards of 
physical and mental growth have been developed and criteria for 
normal social behavior and educational progress are suggested. 

The style of the book is non-technical but test results are so dis- 
cussed as to offer interesting and illuminating research materials to 
the scientific worker. The method of partial correlation is used in the 
interpretation of relationships as determined by mental and physical 
tests. Physical measurements are considered in their interrelations 
and also in relation to standing on mental tests. The authors conclude 
that physical growth is fairly uniform in all of the 16 traits measured 
and that there is a slight positive relationship between height, weight, 
and mental age but little or no relationship between mental age and 
carpal area. In addition to the usual anthropometric measurements, 
numerous motor tests have been adapted and new ones developed. 
Test procedure is so presented that the teacher may use the book as a 
manual for giving and interpreting the tests. 

In their standardization of mental tests for very young children 
the authors have made use of well known performance materials and 
have established new norms for the lower age levels. An interesting 
study of learning is described by means of a card sorting test. Some 
new mental tests are presented with the suggestion that they be 
adapted to group use. Under the discussion of each test there is a 
brief mention of the characteristic behavior of the children tested. 
This phase of the study should be most helpful in the further develop- 
ment of test materials. Combined results indicate that mental tests 
can be satisfactorily adapted to very little children although the 
Binet IQ rating is not extremely stable in the case of the young child. 
The authors found both increase and decrease in repetition of the 
Binet Test. 

The latter section of the book is devoted to a discussion of pre- 
school equipment and procedure. There are seven appendices giving 
references to books, stories and educational materials. While the 
conclusions reached in this section are not so susceptible to scientific 
proof as deductions based upon quantitative measurements, those 
interested in the education of little children should find the study very 
suggestive and stimulating to research in problems of the pre-school 
curriculum. 

In presenting cross sections of child behavior in many and varied 
situations the authors have made a valuable contribution to the 
psychology of the normal child. Furthermore, the results furnish a 
necessary basis for further intensive and specific analysis of narrower 
problems of child behavior by means of laboratory experiments. 

Bess V. CUNNINGHAM. 


A LABORATORY MANUAL FOR EDUCATIONAL PSYCHOLOGY 


Laboratory Studies in Educational Psychology, by Egbert Milton 
Turner and George Herbert Betts. New York: D. Appleton & 
Co., 1924. Pp. IX + 2138. 


Turner and Betts have done a truly serviceable piece of work in 
placing such a book as this one at the disposal of teachers of Psy- 
chology. For some years we have needed brought together a collec- 
tion of experiments which would be really usable in normal school and 
university classes for illustrating the high points of lecture work or 
class work of a more informal nature. Turner and Betts have not 
only collected a set of well chosen experiments, but have made them 
available in excellent teaching form. Most of the experiments are 
prefaced with a brief theoretical summary of the topic to be handled. 
The directions for conducting the experiment are on the whole clear 
and concise. The questions at the end of each experiment are of a 
type that will stimulate intelligent organization of the material 
gathered. 

Unless one is teaching primarily by the laboratory method, how- 
ever, the book contains more material than could be used in a single 
academic course. There are many experiments in the field of sensa- 
tion which would not ordinarily be included in a course in Educational 
Psychology. On the other hand there are a number of experiments in 
the field of intelligence and achievement measurement, which field 
is coming in many normal schools to be increasingly recognized as a 
part of Educational Psychology courses. In addition to these there 
are standard experiments in the classical sub-headings of General 
Psychology: “Attention,” “Memory,” “Imagination,” “Associa- 
tion,” “Conception and Judgment,” “Reasoning,” “Feeling,” “Emo- 
tion;” and still others which savour of a coming standardization of 
subheadings in Educational Psychology: “The Accomplishment Quo- 
tient,” “Comparison of Groups,” “Doctrine of Formal Discipline— 
The Transfer of Training;” and finally, that division without which 
no education book of to-day is complete: “Statistical Treatment 
of Educational Measures.” LEONA VINCENT. 





AN ENGLISH VIEW OF FREUDIANISM 


Modern Theories of the Unconscious, by W. L. Northridge. New York: 
E. P. Dutton & Co., 1924. Pp. XV + 193. 


This is an attractive presentation of the topic “Modern Theories of 
the Unconscious.” It is written from the English and from a somewhat 
pro-Freudian viewpoint. The style is pleasing; the material is well 
organized and is presented in a very readable and semi-popular style. 
Dr. Northridge has used a somewhat chronological sequence of topics 
and has traced, more mechanically perhaps than really can be justified, 
the origin and development of the later Freudian theories in which he 
is particularly interested. Among the topics stressed are (a) the 
early theories together with the philosophical conceptions of their 
authors; (b) the theory of Myers which arose out of his study as a 
member of the Society for Psychical Research; (c) theories of the sub- 
conscious including those of Sidis, Prince, Janet, and others; (d) 
psycho-analytic theories; and (e) the theories of Jung, Adler, and of 
“Freud and his school as the most important of the theories of the 
unconscious.” In conclusion he indicates certain restrictions which he 
considers should surround the concept and statement of Freud’s 
theories whose value lies chiefly in providing a new viewpoint from 
which to study further the phenomena concerned. This should be 
emphasized, for as Mr. J. Laird, who is author of the introduction, points 
out, it would be unfair to Dr. Northridge to assume that he believes a 
final solution has been found. 

The straightforward and rather brilliant treatment of the subject is 
distinctly appealing and so may be accepted uncritically by laymen. 
Confusion in terminology and the unstandardized use of words make 
an adequate treatment difficult. The final conclusion, however, is 
not convincing to those who question the doubtful assumptions 
underlying Freudian doctrines. However, the work stands as an 
interesting and a valuable contribution in its honest attempt to formu- 
late a concise statement of the theories of those who have taken such 
prominent parts in the pioneer work on this very interesting, but yet 
unsolved, problem. EDWIN MAURICE BAILOR. 


A MYSTICAL VIEW OF PSYCHOLOGY 


Life and Word: An Essay in Psychology, by R. E. Lloyd. Longmans, 
Green & Co., London, 1924. Pp. 139. 


An analytical criticism of this little book would necessitate a dis- 
cussion of Idealism, Vitalism and Pragmatism; for the author takes his 
cues from Kant, Bergson and James. Obviously, this cannot be done 
in a short review. 

Showing the influence of Kant, this book is much concerned with 
“man’s psychic unity” which the author upholds. Substituting for 
Bergson’s “intellect” the term “verbal-thought,” which he describes 
as the quality of humanity belonging to the people of the world (but 
which does not mean the people of the world) he finds the three constitu- 
ents of humanity to be: (1) Obedience—the “categorical imperative” 
of Kant, (2) generosity—the feeling of resemblance, and (3) faith—the 
sense of mystery. The thesis of the book is: “Thought is not taken 
from things but taken mysteriously—” Kant’s “Dinge an sich.” 

Those of us who believe with Robinson and Dewey that there is 
no immaculate conception of ideas and that “reason pure of all 
influence of experience is a fiction” will consider this book an essay 
in Metaphysics rather than Psychology. For the author insists that 
“Life is a mystery which cannot be understood” (p. 58) and maintains 
that “The inventive impulse of man is wrapt in the same mystery as 
the creative impulse of life” (p. 81). 

Mysticism? Yes—but of interest to students of philosophy and 
others who like to “follow the argument.” H. MELTZER. 


A SOURCE BOOK FOR THE CURRICULUM MAKER 


The Education of the Consumer, by Henry Harap. New York: The 
Macmillan Co., 1924. Pp. XXII + 360. 


Illustrating the trend toward more scientific curriculum construc- 
tion this book offers the curriculum specialist, the teacher and the 
administrator a wealth of data concerning the consumption of the 
American people of food, clothing, fuel and shelter. Evidence, 
statistical and otherwise, bearing upon the habits of our people which 
relate to the consumption of economic goods has been collected from 
many sources and is presented in a concise, classified and interesting 
form. Tables of relative values of food, fuel, clothing and building 
materials for designated purposes are given to guide the curriculum 
maker in the presentation of facts. Wasteful practices of our people 
due chiefly to ignorance are pointed out, and comparisons are made 
with standards of efficient consumption insofar as the latter are 
available. In connection with the discussion of each topic the objec- 
tives of education desirable to maintain or increase wise use of the 
material things of life are formulated. These objectives are repro- 
duced in the final chapter, arranged in classified form and given in 
sufficient detail to be of direct aid in the formulation of curriculum 
materials. This determination of objectives is, in fact, given as the 
specific aim of the study. Nevertheless, the curriculum maker will 
find the presentation of factual material of marked utility. The lay- 
man will find the book of value as a consumer’s handbook. A classified 
bibliography is appended. 

This book represents an important contribution to the movement 
which has been developing during the past decade to remove the deter- 
mination of objectives and content of curricula from the field of 
individual judgment and base it upon more reliable and more objective 
measures. The scope of the study is not limited by the range of any 
school subject, but is rather determined by an important phase of 
life’s activities. Contributions are made to the school subjects of 
household and industrial arts, social studies, science, arithmetic, health 
and hygiene. J. H. 





PRINCIPLES AND AIMS OF SUPERVISION 


Educational Supervision, by Charles Edgar Scott. Milwaukee: The 
Bruce Publishing Co., 1924. Pp. 98. 


While this little book “is designed for principals, supervisors, and 
superintendents, and for use in classes in supervision in teacher-training 
institutions,” it is not intended in any sense as a handbook of super- 
vision. The author “has aimed primarily at a clearer, more definite 
and concise statement of the underlying principles and guiding aims 
of supervision.” The brevity of the treatment has led to a failure to 
consider adequately the underlying principles involved, and to limit 
attention to what are considered the three most important aims of 
supervision: Insuring continuity in the child’s educational program; 
securing economy in the child’s educational progress; developing teach- 
ing ability, with intelligent and masterly self-direction as the ultimate 
purpose. Most of the book is devoted to the development of these 
aims, with a final chapter on “Supervision in Operation.” 

The author defines supervision as “that form of school management 
which has as its function the coordination, stimulation, and direction 
of instruction,” and conceives the supervisor to be primarily a teacher 
of teachers. He assigns to the supervisor the responsibility for 
inspectional activity, and demands that he be expert, not only in the 
supervision of instruction, but in the administration of tests and meas- 
urements, and in educational and vocational guidance as well, where 
there are not others to assume these responsibilities under his 
direction. 

The author recognizes the need for democracy in supervision— 
an excellent example of it is given in connection with the revision of 
the course of study—yet insists, in essence, that democracy applies 
as between equals and experts. He sees the welfare of the child as the 
end product of supervision, and the supervisor as being responsible, 
through conditioning the content, materials, and methods of instruc- 
tion, for the achievement of that end product. It is at this point that 
proponents of greater freedom and self-expression for the teacher will 
take issue with him most strongly. 
There is little that is new in this book, yet it is interesting because 
of its conciseness, the vigor with which the author presents his views, 
and the challenge which it offers to some of the current concepts in 
supervision. BENJAMIN R. SHOWALTER. 


METHODS 


Progressive Methods of Teaching, by Martin J. Stormzand. Boston: 
Houghton Mifflin Co., 1924. Pp. XII + 375. 


This is not a listing of devices. As a treatment, however, of general 
methods in elementary and secondary education it is remarkably 
satisfying, because it springs from copious, sanely proportioned, and 
well coordinated knowledge of origins, underlying philosophies, related 
psychology, scientific investigations, and current practices. Its pages 
speak with authority, and, let it be noted, in a language and style 
calculated to convey meaning readily and pleasantly. The book 
should stimulate that kind of interest in method, which results in 
experiment and in the creation of working devices. 

The chapters deal with textbooks, collateral materials, study and 
its supervision, project and problem, type lessons, drill, the socialized 
recitation, lesson planning, and individualized instruction. With 
a few exceptions, such as the discussions of drill, reviews, and lesson 
planning, the presentation of subjects is fresh in organization and 
spirit. Analyses are clear, evaluations are referred as much as possible 
to scientific and objective criteria, definitions are comprehensive yet 
simply phrased, illustrations and models are practical. Anyone 
familiar with the field would be likely to agree that the principles and 
prescriptions approved here are the best and safest now possible to 
formulate. 

As special instances of the author’s clarity of thought and skill 
in composition, may be cited his treatments of the Herbartian “steps,” 
thinking, interest, project and problem, and appreciation. The 
volume would seem to be most valuable as a text in educational classes 
of graduate level. It is well equipped with teaching and learning guides. 
At the beginnings of the chapters there are excellent outlines of the 
following discussions, and at the ends there are tests and “study- 
guides” in the form of enumeration, completion, and right-wrong 
exercises. “Parallel readings” are suggested for each major topic 
treated. M. H. WILLING. 
CORRECTION 


Dr. H. B. Reed states in this Journal, Vol. XV, p. 595: “Brown 
does not give his number of subjects.” Reference to my paper, 
“Whole or Part Methods in Learning,” this Journal, Vol. XV, p. 
232, in the only table published, shows that I used 4 groups of sub- 
jects, numbering respectively 166, 83, 142, 124. 

WARNER Brown. 





Psychological Tests of Educable Capacity, Report of Consultative 
Committee of the Board of Education, 1924, London, England, which 
was reviewed in the December issue may be obtained from The 
British Library of Information, 8th Floor, 44 Whitehall Street, 
New York. 